Claude Code is deceptively capable. Point it at a codebase, describe what you need, and it’ll autonomously navigate files, write code, run tests, and iterate on failures.
The problem is that it doesn’t know when to stop.
I’ve watched it burn thousands of tokens refactoring perfectly functional code because my prompt mentioned “clean architecture,” continuing execution long after the core task was done simply because there was always something that could be improved.
This is the completion problem in agentic AI. Systems optimized for autonomous execution fundamentally lack a clear “done” primitive. Claude Code doesn’t have built-in mechanisms to evaluate task completion, confirm with you, and exit cleanly. The cost isn’t theoretical. Runaway loops burn tokens, pollute context with tangential work, and leave tasks in ambiguous states that are difficult to audit later.
Ralph addresses this by adding exit gates, circuit breakers, and prompt-driven completion criteria. But tooling is only part of the story. In practice, prompt specificity is the dominant variable determining whether you get a clean three-iteration completion or a 20-loop spiral.
The missing primitive in agentic AI isn’t capability. It’s completion detection.
Claude Code operates on a self-continuation heuristic: if something can be improved, keep going. That behavior is useful for exploration and refactoring, but it breaks down for bounded tasks with a clear definition of done.
The issue compounds when objectives are vague. If you tell Claude Code to “build a CLI tool” without specifying constraints or exit criteria, you’re effectively creating an unbounded search space. Should it add config file support? Edge-case error handling? Logging? Sorting flags? Authentication? Comprehensive test coverage?
All of these features are defensible. Nothing in the prompt signals completion. So execution continues until context limits, tool errors, or manual interruption force a stop.
That structural bias toward continuation is what makes agentic systems powerful. It’s also what makes them expensive when left unconstrained.
Ralph doesn’t replace Claude Code. Instead, it wraps it in explicit control structures designed to enforce termination.
At its core, Ralph implements the exit gates, circuit breakers, and prompt-driven completion criteria introduced above.
Its loop flow is explicit:
prompt → plan → execute → evaluate → exit or continue
By default, Ralph enforces iteration ceilings and token constraints, ensuring that even poorly scoped tasks eventually halt. It won’t prevent vague prompts from generating unnecessary work, but it will prevent indefinite continuation.
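The loop flow above can be sketched as a bounded driver. This is a minimal illustration, not Ralph's actual implementation: runLoop, the step callback, and the maxIterations and maxTokens option names are all hypothetical. They are chosen only to show how an iteration ceiling and a token budget guarantee that a loop halts even when the agent never signals completion.

```javascript
// Hypothetical sketch of a bounded agent loop; not Ralph's real code.
// `step` stands in for one plan → execute → evaluate pass and must return
// { done: boolean, tokensUsed: number }.
function runLoop(step, { maxIterations = 10, maxTokens = 50000 } = {}) {
  let tokens = 0;
  for (let i = 1; i <= maxIterations; i++) {
    const result = step(i);
    tokens += result.tokensUsed;
    // Exit gate: the agent reported that the task is complete.
    if (result.done) return { reason: "complete", iterations: i, tokens };
    // Token constraint: stop once the budget has been reached.
    if (tokens >= maxTokens) return { reason: "token_limit", iterations: i, tokens };
  }
  // Iteration ceiling: a poorly scoped task still halts here.
  return { reason: "iteration_limit", iterations: maxIterations, tokens };
}

// Usage: a stand-in agent that finishes on its third pass.
const finished = runLoop((i) => ({ done: i === 3, tokensUsed: 1000 }));
console.log(finished); // { reason: 'complete', iterations: 3, tokens: 3000 }

// Usage: a stand-in agent that never reports completion.
const capped = runLoop(() => ({ done: false, tokensUsed: 1000 }), { maxIterations: 5 });
console.log(capped.reason); // iteration_limit
```

The key design point is that the ceilings are checked by the driver, not by the agent, so termination does not depend on the model's own judgment.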
To see how tooling and prompt specificity affect execution behavior, I ran the same objective three different ways:
Build a CLI tool that fetches GitHub repository stats (stars, forks, open issues) and displays them in a formatted table.
Setup: a fresh directory with no existing code. Each scenario started from the same baseline objective but used different orchestration and prompt detail.
Prompt:
Build a CLI tool that fetches GitHub repository stats and displays them in a formatted table.
Claude Code didn’t ask clarifying questions. It made autonomous decisions about language, structure, and scope.
With no guidance on stack or boundaries, it chose Node.js and added features that were never requested, including --sort flags.
Its execution flow included running npm init -y and creating index.js, table.js, and test.js.
Then it stopped. There was no explicit completion marker. No structured signal indicating that the task was finished. Just a summary of what it built.
The result was a functional tool produced in roughly two minutes. But it included additional features that were never requested, and there was no explicit completion primitive. That’s manageable in a supervised session. It’s risky in an autonomous workflow.
Using Ralph, I initialized a new project and defined an exit protocol requiring this exact output:
RALPH_STATUS: STATUS: COMPLETE EXIT_SIGNAL: true
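An exit gate has to recognize that status block in the agent's output. The following is a minimal sketch of such a check, assuming the status is emitted as plain text in the form shown above. parseExitSignal and shouldExit are hypothetical helpers, not part of Ralph's API, and Ralph's real detection logic may differ.

```javascript
// Hypothetical helper: detect the completion block in agent output.
// Assumes the block appears as plain text, on one line or several.
function parseExitSignal(output) {
  const match = output.match(
    /RALPH_STATUS:\s*STATUS:\s*(\w+)\s*EXIT_SIGNAL:\s*(true|false)/
  );
  if (!match) return { found: false, status: null, exit: false };
  return { found: true, status: match[1], exit: match[2] === "true" };
}

// The exit gate only opens when both fields agree.
function shouldExit(output) {
  const { found, status, exit } = parseExitSignal(output);
  return found && status === "COMPLETE" && exit;
}

console.log(shouldExit("RALPH_STATUS: STATUS: COMPLETE EXIT_SIGNAL: true")); // true
console.log(shouldExit("All tests pass. I could also add caching next.")); // false
```

Requiring an exact, machine-checkable marker means that free-form remarks such as "this looks finished" never count as completion.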
Prompt:
Build a GitHub stats CLI tool. Make it good.
Loop #1 ran for five minutes and forty-one seconds.
Only one command was actually requested. The agent also added user profile fetching, language breakdown analysis, and rate-limit checking. All of these features are plausible interpretations of “make it good,” but none were explicitly required.
Ralph correctly detected the exit signal. However, a permissions configuration issue triggered the circuit breaker during a subsequent loop attempt, halting execution after bash command denials.
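The circuit breaker behavior described here can be illustrated with a small counter that trips after repeated failures. This is a sketch under assumptions: the CircuitBreaker class and its threshold of three consecutive failures are hypothetical, and Ralph's actual thresholds and failure categories are not covered in this article.

```javascript
// Hypothetical circuit breaker: trips after N consecutive failed loops.
class CircuitBreaker {
  constructor(threshold = 3) {
    this.threshold = threshold;
    this.consecutiveFailures = 0;
    this.open = false; // open = execution halted
  }

  // Record the outcome of one loop; returns true if the loop may continue.
  record(succeeded) {
    if (this.open) return false;
    this.consecutiveFailures = succeeded ? 0 : this.consecutiveFailures + 1;
    if (this.consecutiveFailures >= this.threshold) this.open = true;
    return !this.open;
  }
}

// Usage: one success, then three denied bash commands in a row.
const breaker = new CircuitBreaker(3);
const outcomes = [true, false, false, false].map((ok) => breaker.record(ok));
console.log(outcomes); // [ true, true, true, false ]
console.log(breaker.open); // true
```

Resetting the counter on success keeps an occasional failure from halting a healthy run, while a streak of denials stops the loop quickly.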
The end result was a functional tool with substantial scope creep. Ralph prevented runaway continuation, but the vague objective had already expanded the implementation beyond what was requested.
In the third scenario, the prompt included precise requirements, such as accepting the repository in owner/repo format, and verifiable exit conditions.
The prompt also included an explicit constraint: do not add features beyond the requirements.
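One way to keep such prompts consistent is to generate them from a small specification. The sketch below is illustrative only: buildPrompt is a hypothetical helper, and the example requirements and exit conditions are stand-ins rather than the exact lists used in this test.

```javascript
// Hypothetical helper that renders a bounded task prompt.
function buildPrompt({ objective, requirements, exitConditions }) {
  const bullets = (items) => items.map((item) => `- ${item}`).join("\n");
  return [
    objective,
    "",
    "Requirements:",
    bullets(requirements),
    "",
    "Exit conditions (all must be verifiably true):",
    bullets(exitConditions),
    "",
    "Do not add features beyond the requirements.",
    "When every exit condition holds, output exactly:",
    "RALPH_STATUS: STATUS: COMPLETE EXIT_SIGNAL: true",
  ].join("\n");
}

// Example values are assumptions made for illustration.
const prompt = buildPrompt({
  objective: "Build a CLI tool that fetches GitHub repository stats.",
  requirements: [
    "Accept a repository in owner/repo format",
    "Display stars, forks, and open issues in a formatted table",
  ],
  exitConditions: ["All tests pass", "The CLI prints a table for a valid repository"],
});
console.log(prompt);
```

Each exit condition should be something the agent can check mechanically, such as a passing test run, rather than a judgment like "the code is clean."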
Loop #1 completed in two minutes and ten seconds, about 62 percent less time than the vague-prompt scenario (130 seconds versus 341). It created exactly two files: src/index.js and a corresponding test file. It used Node’s built-in https module instead of introducing external dependencies. Five tests covered input parsing, formatting, and error handling.
Execution was clean, focused, and aligned precisely with the defined criteria. There was no scope creep, and termination occurred immediately after the exit conditions were satisfied.
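For orientation, a tool of this shape can be quite small. The sketch below is not the code the agent generated; it is an illustration under assumptions, with parseRepo, formatTable, and fetchStats as hypothetical names. It uses Node's built-in https module and the public GitHub REST endpoint for repositories, whose response includes stargazers_count, forks_count, and open_issues_count.

```javascript
const https = require("https");

// Parse "owner/repo" input; throws on anything else.
function parseRepo(input) {
  const match = /^([\w.-]+)\/([\w.-]+)$/.exec(String(input).trim());
  if (!match) throw new Error(`Expected owner/repo, got "${input}"`);
  return { owner: match[1], repo: match[2] };
}

// Render stats as a simple two-column table.
function formatTable(stats) {
  const rows = [
    ["Stars", stats.stars],
    ["Forks", stats.forks],
    ["Open issues", stats.openIssues],
  ];
  const width = Math.max(...rows.map(([label]) => label.length));
  return rows.map(([label, value]) => `${label.padEnd(width)} | ${value}`).join("\n");
}

// Fetch stats from the GitHub REST API. Needs network access, and the API
// rejects requests that lack a User-Agent header.
function fetchStats({ owner, repo }) {
  const options = {
    hostname: "api.github.com",
    path: `/repos/${owner}/${repo}`,
    headers: { "User-Agent": "repo-stats-sketch", Accept: "application/vnd.github+json" },
  };
  return new Promise((resolve, reject) => {
    https
      .get(options, (res) => {
        let body = "";
        res.on("data", (chunk) => (body += chunk));
        res.on("end", () => {
          if (res.statusCode !== 200) {
            return reject(new Error(`GitHub API returned ${res.statusCode}`));
          }
          try {
            const data = JSON.parse(body);
            resolve({
              stars: data.stargazers_count,
              forks: data.forks_count,
              openIssues: data.open_issues_count,
            });
          } catch (err) {
            reject(err);
          }
        });
      })
      .on("error", reject);
  });
}

// Offline usage: parsing and formatting need no network.
console.log(parseRepo("nodejs/node")); // { owner: 'nodejs', repo: 'node' }
console.log(formatTable({ stars: 10, forks: 2, openIssues: 1 }));
```

Keeping parsing and formatting as pure functions is what makes a handful of tests sufficient: only fetchStats touches the network.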
Across the three scenarios, two variables consistently shaped the outcome.
First, orchestration determines whether execution can run indefinitely. Ralph’s exit gates and circuit breakers provide structural guarantees that tasks will eventually halt.
Second, and more importantly, prompt specificity determines scope. Vague requirements expanded the implementation fourfold in a single iteration. Explicit exit criteria constrained the agent to the minimal viable implementation and reduced execution time significantly.
Ralph adds boundaries. It doesn’t define what “done” means. That responsibility still belongs to the person writing the prompt.
The completion primitive in agentic systems isn’t simply “run in a loop.” It’s “run in a loop with verifiable stop conditions.”
Agentic AI systems like Claude Code can execute complex development tasks autonomously. But defining completion is still a human design problem. Without explicit success criteria, agents default to continuation. With clear exit conditions, they terminate cleanly and efficiently.
In production workflows, completion isn’t automatic. It has to be engineered through both orchestration and precise prompt design.