Report #49641
[synthesis] AI agent loop gets stuck in cycles or drifts from the original task during multi-step code changes
Structure the agent loop as repeated Plan, Execute, Verify cycles with explicit state tracking. Separate the planning LLM call (which produces a step list) from the execution calls (which run one step each). After each step, verify the result (run tests, check command output, read the modified file) before proceeding. If verification fails, feed the error back and re-plan only the remaining steps; do not restart from scratch. Maintain a persistent task state object that tracks the original goal, completed steps, current step, and remaining steps.
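The loop described above can be sketched in Python roughly as follows. This is a minimal illustration under stated assumptions, not any particular agent's implementation: `TaskState`, `run_task`, and the `plan`, `execute`, and `verify` callables are hypothetical names standing in for the planning LLM call, the per-step execution call, and an automated check. The `max_replans` cap is also an assumption, added so a step that keeps failing cannot cycle forever.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class TaskState:
    """Persistent state carried across the whole task."""
    goal: str
    completed: List[str] = field(default_factory=list)
    current: Optional[str] = None
    remaining: List[str] = field(default_factory=list)


# Hypothetical interfaces; in a real agent the first two wrap LLM calls.
Planner = Callable[[TaskState, Optional[str]], List[str]]   # (state, last error) -> steps
Executor = Callable[[str, TaskState], str]                  # (step, state) -> output
Verifier = Callable[[str, str], Tuple[bool, str]]           # (step, output) -> (ok, error)


def run_task(goal: str, plan: Planner, execute: Executor,
             verify: Verifier, max_replans: int = 3) -> TaskState:
    state = TaskState(goal=goal)
    state.remaining = plan(state, None)          # planning call: global view
    replans = 0
    while state.remaining:
        state.current = state.remaining.pop(0)
        output = execute(state.current, state)   # execution call: one step only
        ok, error = verify(state.current, output)
        if ok:
            state.completed.append(state.current)
            state.current = None
            continue
        if replans >= max_replans:
            raise RuntimeError(
                f"step {state.current!r} still failing after "
                f"{max_replans} re-plans: {error}"
            )
        replans += 1
        # Re-plan only what is left; completed steps are kept as they are.
        state.remaining = plan(state, error)
        state.current = None
    return state
```

Because the planner receives the full `TaskState` plus the last error, it can see what is already done and which step just failed, and it only needs to return a replacement for the unfinished part of the plan.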
Journey Context:
Single-shot prompting fails on complex tasks because the LLM cannot verify intermediate results. A naive agent loop (the LLM calls tools in a loop) drifts because, without explicit planning, each step is generated without awareness of the overall strategy. Devin's architecture shows a plan-execute-verify loop: it writes a plan, executes steps, runs tests after each, and adjusts. OpenHands implements this as a state machine with explicit planning, execution, and verification phases. Cursor Composer plans file changes before making them and can roll back. The synthesis insight: the plan and execute phases must be separate LLM calls, because planning requires a global view (expensive context) while execution requires focused local context (cheap but precise). Mixing them in one call causes the model to either lose the plan while executing or execute poorly while planning. Verification must be automated (test runs, syntax checks) rather than left to the LLM's assessment of its own output.
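As an illustration of automated verification that does not depend on the model judging itself, the sketch below shows two checks that each return an `(ok, message)` pair suitable for feeding back to a planner. The function names `check_syntax` and `run_check` are hypothetical, and the timeout and output-truncation limits are arbitrary assumptions; only the standard-library `ast` and `subprocess` modules are used.

```python
import ast
import subprocess
from typing import List, Tuple


def check_syntax(source: str) -> Tuple[bool, str]:
    """Syntax check for Python source; never executes the code."""
    try:
        ast.parse(source)
    except SyntaxError as exc:
        return False, f"line {exc.lineno}: {exc.msg}"
    return True, ""


def run_check(cmd: List[str], timeout: float = 120.0) -> Tuple[bool, str]:
    """Run a test or lint command; success means exit code 0."""
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, f"timed out after {timeout}s"
    # Keep only the tail of the output so the feedback stays small (assumed limit).
    return proc.returncode == 0, (proc.stdout + proc.stderr)[-2000:]
```

A check such as `run_check(["pytest", "-q"])` would then decide whether a step passed, assuming the project's tests run under pytest; the exit code, not the model, determines the outcome.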
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:48:21.093417+00:00— report_created — created