AI Activity to Impact · May 27, 2026 · 9 min read

AI Agents Run Toward Goals. Are Yours Worth It?

Claude's /goal feature lets agents work until a completion condition is met. Most examples are output-oriented: tests pass, backlog empties. Outcome is missing.

Loop Engineering: from activities to outputs to outcomes

Loop engineering is how we get agency from AI. Prompt engineering got us through 2024, context engineering defined 2025 (and continues to be important), and now loop engineering is all the rage in 2026. Whether it’s development lifecycles managed as compounding, continuously improving loops, scheduled routines, or truly autonomous agentic workflows, these all rely on effective loop engineering.

What does a loop look like? The /loop and /goal capabilities you can find in agent harnesses such as Claude Code, Codex, and Antigravity are good examples.

The /goal feature lets you set a completion condition and have the AI keep working across turns until the condition holds: a lightweight autonomous loop without you having to prompt each step. It is a genuinely useful step toward higher-agency AI.

But look at the canonical examples from Anthropic:

migrate an API until every call site compiles and tests pass
implement a design doc until all acceptance criteria hold
split a large file
empty a labeled issue backlog

Reading through this list, something jumped out at me. Every single one of these goals is output-oriented (or, you could argue, activity-oriented).

Nothing in the example list asks whether a feature was actually adopted, whether a page works well for visitors, or whether a presentation landed with the audience.

When you give AI (and humans…) output-oriented goals, they tend to focus on the output. The result might be a working feature that nobody uses, a page that nobody reads, or a presentation that nobody understands.

To really unlock high-agency AI, we need to set outcome-oriented goals and instrument the system so that the agents can actually measure whether the outcome was reached.

What changed in Claude?

Anthropic recently shipped a capability called completion goals: you set a target condition with /goal, and Claude keeps working across turns until the condition is met. After each turn, a lightweight model checks whether the condition holds. If it does not, Claude starts another turn instead of returning control to you. No more nudging, re-prompting, or babysitting a multi-step sequence.

Loop Engineering - New to the AI frontier, a core practice in tackling complex systems

Seeking a goal this way is meaningfully different from a one-shot prompt. It is closer to delegating a problem to someone and telling them not to come back until it is done.

This is referred to as loop engineering: engineering effective agentic loops that unleash the power of agents by applying a core concept from the design of complex systems, namely tight feedback loops that try, sense, and respond in pursuit of a goal.

Ralph Loops

In my own agentic workflows for developing and evolving my web presence and delivery capabilities, I have been building something similar manually: a Ralph loop script.

More or less, this is what these scripts do:

Run an LLM with a prompt
Evaluate the result against an exit/success condition
Exit if the condition is met
Otherwise, repeat the loop
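
The steps above can be sketched in a few lines of Python. This is a minimal illustration, not my actual script: `ralph_loop`, `run_agent`, and `condition_met` are hypothetical names, the agent is a stand-in function rather than a real LLM call, and the `max_iterations` budget is an assumption I added so the loop cannot run forever.

```python
from typing import Callable


def ralph_loop(
    run_agent: Callable[[str], str],
    condition_met: Callable[[str], bool],
    prompt: str,
    max_iterations: int = 10,
) -> tuple[bool, int]:
    """Re-run the agent until the exit condition holds or the budget runs out."""
    for iteration in range(1, max_iterations + 1):
        result = run_agent(prompt)      # 1. run the LLM with the prompt
        if condition_met(result):       # 2. evaluate the exit/success condition
            return True, iteration      # 3. exit if the condition is met
        # 4. otherwise fall through and repeat the loop
    return False, max_iterations


# Stand-in agent for illustration: it "succeeds" on its third attempt.
attempts = {"count": 0}


def fake_agent(prompt: str) -> str:
    attempts["count"] += 1
    return "tests pass" if attempts["count"] >= 3 else "tests fail"


done, turns = ralph_loop(fake_agent, lambda out: out == "tests pass", "Fix the failing tests")
print(done, turns)  # True 3
```

Note that the exit condition here is exactly the kind of output-oriented check discussed in this post: it only looks at what the agent produced.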

The /goal feature moves that pattern into the harness itself, which makes it accessible to anyone without custom scripting.

What’s missing in the typical agentic loop

When Anthropic introduces a feature like this, the canonical examples they choose are telling. It’s not a coincidence that all the examples above are technical.

That list is not wrong — those are real, useful things to automate. But look at what is not on the list.

There is no example of:

“Landing page that converts”
“Feature that users find useful and are willing to pay for”
“Presentation that teaches the audience something useful that sticks with them and changes their behavior”
“Podcast that earns downloads and listens”

In other words, nothing that ensures we build something that moves the needle.

The constraint AI agents are facing

Those absences are not accidental. They reflect a real constraint: AI agents can close the loop on technical correctness far more easily than they can close the loop on human behavior and value.

Whether tests pass is observable by a machine.

Whether people use a feature, whether a page works for real visitors, whether a talk lands — those require a fundamentally different kind of signal.

This is the core tension. Output is easy to measure inside the system. Outcome lives outside it, in the behavior and experience of the people you were trying to help.

When you set a completion goal around a technical criterion, the agent has clear stopping conditions it can evaluate autonomously and reliably. When you set one around an outcome, you immediately run into the question: how would the agent observe whether that condition holds? The agent can write the code. It cannot measure whether the code moved the metric you care about. It can publish the blog post. It cannot tell you whether anyone read it, thought differently as a result, or took a meaningful next step.

Why do we care? What’s wrong with focusing on outputs and deliverables?

Scaling output production is valuable. But the real goal isn’t activity or even output.

Organizations are looking for business impact. Revenue growth. Improved margins. Reduced risk.

But pages, features, presentations, and live artifacts don’t necessarily connect to impact. Impact comes from creating leverage: fewer people (and other agents!) able to deliver better value, more safely and more happily.

The Agent Output Trap

Let’s simply shift to outcome-oriented loops

Isn’t the answer to simply adopt outcome-oriented goals?

Here’s what my outcome-framing skill suggested as possible outcome framings for Anthropic’s activity/output goals:

Instead of “migrate an API until it compiles and tests pass”: “Developers will be able to use the new unified API to access real-time inventory, resulting in [reduced checkout latency / fewer out-of-stock orders].”
Instead of “implement a design doc until all acceptance criteria hold”: “Customers will be able to manage their subscription plans self-serve, resulting in a 20% reduction in support tickets.”
Instead of “split a large file”: “Engineers will be able to modify the billing flow without risking conflicts in unrelated modules, resulting in fewer deployment rollbacks.”
Instead of “empty a labeled issue backlog”: “Users will be able to complete checkout without payment timeouts, resulting in a 5% increase in conversion rate.”

The Observability Gap

Reframing the goal is not enough on its own. As long as agents cannot see whether their actions and outputs are really helping, they cannot close a real feedback loop. They can spend a lot of tokens building tons of stuff that is beautiful, well designed, and works well, but is useless.

If you want to give AI agents outcome-oriented goals, you need to solve a prior problem: how does the agent know whether the outcome was reached? This means instrumentation. It means closing the feedback loop between what AI produces and whether that production moved the needle. It means building the observability layer that lets a completion condition like “users adopted this feature” or “this content performs” actually be evaluated, not just assumed.
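
To make this concrete, here is a rough sketch of what a completion condition like “users adopted this feature” could look like once usage events are available to the agent. Everything here is an assumption for illustration: the `UsageEvent` schema, the `adoption_rate` and `feature_adopted` helpers, the feature names, and the 20% default threshold are hypothetical, and a real version would read from your own analytics store.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageEvent:
    user_id: str
    feature: str  # name of the feature the user touched


def adoption_rate(events: list[UsageEvent], feature: str, active_users: set[str]) -> float:
    """Share of active users who used the feature at least once."""
    if not active_users:
        return 0.0
    adopters = {e.user_id for e in events if e.feature == feature and e.user_id in active_users}
    return len(adopters) / len(active_users)


def feature_adopted(
    events: list[UsageEvent],
    feature: str,
    active_users: set[str],
    threshold: float = 0.2,
) -> bool:
    """Outcome-oriented completion condition: True once adoption reaches the threshold."""
    return adoption_rate(events, feature, active_users) >= threshold


events = [
    UsageEvent("u1", "self_serve_billing"),
    UsageEvent("u1", "self_serve_billing"),  # repeat use by the same user counts once
    UsageEvent("u2", "search"),
    UsageEvent("u3", "self_serve_billing"),
]
active = {"u1", "u2", "u3", "u4", "u5"}

print(adoption_rate(events, "self_serve_billing", active))  # 0.4
print(feature_adopted(events, "self_serve_billing", active, threshold=0.5))  # False
```

The code is trivial. The hard part is everything it assumes: that the events exist, that they are trustworthy, and that the agent is allowed to query them.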

Most organizations do not have that instrumentation today — not for AI outputs, and often not for human outputs either. We track task completion, story points, PRs merged, tickets closed. We are much weaker on adoption rates, usage patterns, business metric movement, and the causal chain between what we built and what changed. AI acceleration makes this gap more expensive, not less. When the rate of output production increases dramatically, the cost of producing things that nobody uses or that fail to achieve their purpose scales with it.

The work of building new observability is not glamorous. It does not feel as exciting as shipping a /goal feature. But it is what separates the organizations that will use agentic AI to drive real impact from those that will use it to drive impressive-looking activity.

Closing the Agent Feedback Loop

What has to be true for outcome-oriented agent goals?

For outcome-oriented goals to become the norm in agentic AI, a few things need to happen. First, people need to learn how to frame outcome-oriented conditions effectively. Most of us default to output because that is what we have historically tracked and because outputs are easier to specify. Writing a good outcome condition requires clarity about what change you want to see in the world, not just what artifact you want produced.

Second, the feedback loop needs to be closeable by the agent. That means investing in the observability layer — instrumentation, telemetry, measurable leading indicators of value — so that a completion condition based on adoption or quality or business impact can actually be evaluated, not just claimed. For some domains this is a relatively short road; for others it is a multi-year engineering and organizational investment.

Third, the tooling needs to evolve to handle the parts of outcomes that still require human judgment or physical-world observation. Image generation, usability testing, real user feedback, business metric validation — these require either agent-accessible tools or explicit human checkpoints in the loop.
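
One possible shape for an explicit human checkpoint is a completion check with three results instead of two: met, not met, or “ask a person”. The sketch below is an assumption about how that could be wired, not a feature of any existing harness; `Verdict`, `evaluate_goal`, `automated_check`, and `human_review` are hypothetical names.

```python
from enum import Enum
from typing import Callable, Optional


class Verdict(Enum):
    MET = "met"
    NOT_MET = "not_met"
    NEEDS_HUMAN = "needs_human"


def evaluate_goal(
    automated_check: Callable[[], Optional[bool]],
    human_review: Optional[Callable[[], bool]] = None,
) -> Verdict:
    """Use the automated signal when there is one; otherwise escalate to a person."""
    signal = automated_check()          # None means "the agent cannot observe this"
    if signal is None:
        if human_review is None:
            return Verdict.NEEDS_HUMAN  # pause the loop instead of guessing
        signal = human_review()
    return Verdict.MET if signal else Verdict.NOT_MET


# Telemetry can answer this one without a person.
print(evaluate_goal(lambda: True))                              # Verdict.MET
# "Did the talk land?" has no machine-readable signal, so the loop pauses.
print(evaluate_goal(lambda: None))                              # Verdict.NEEDS_HUMAN
# Same question, with a reviewer's judgment wired in as the checkpoint.
print(evaluate_goal(lambda: None, human_review=lambda: False))  # Verdict.NOT_MET
```

The design choice that matters is the third state: when the agent has no way to observe the outcome, it stops and asks rather than declaring the goal met.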

The organizations that get ahead of this will not just be faster at producing output. They will be able to aim that speed at the right targets and know when they have hit them.

Aim Agent Goals at Outcomes, Not Outputs

What should you ask before setting an AI goal?

You do not need to wait for full observability infrastructure to start reorienting your AI goals. The first move is to ask, for every goal you set: is this a completion condition for an output, or for an outcome? If it is output, is that output reliably connected to the outcome you actually care about, and do you have enough signal to know when it is not?

That question will quickly surface the gaps. It will show you where you are measuring task completion and calling it progress. It will point toward the observability investments worth making. And it will make visible the distinction between AI as an accelerant for activity and AI as a driver of actual impact.

The /goal feature is a real step forward. Use it. But the highest-leverage move is not to chase more efficient loops around output-oriented goals. It is to build the feedback systems that let you aim agentic AI at what actually matters.

What are you doing with completion goals? Are yours output-oriented or outcome-oriented — and what would it take to close the loop on a real outcome?

The gap between what AI produces and whether it mattered is the next frontier. Build for that one.

Scaling AI Activity to Impact

Practical thinking on turning AI pilots, adoption, and portfolio work into business impact - by finding the constraint, changing the work, and proving value as you go.

Yuval Yeret helps product and tech leaders move from agile theater to evidence-informed delivery.
