Report #76229
[counterintuitive] LLM produces a detailed multi-step plan but execution diverges or fails
Use iterative plan-execute-observe loops. Have the model plan 1-2 steps ahead, execute them, observe results, then plan the next steps. Never ask the model to produce a complete multi-step plan and then execute it end-to-end without intermediate checkpoints and replanning.
Journey Context:
Developers ask models to 'first create a detailed plan, then implement it', mimicking how senior engineers work. But autoregressive models generate tokens left to right without backtracking. When the model writes a 10-step plan, it produces step 10 with no way to go back and revise steps 1-9 in light of what it realized along the way. Each step is locally coherent, but the plan as a whole is often globally inconsistent: step 7 may contradict an assumption made in step 2, or the plan may depend on information that only becomes available after step 3 has actually run. Humans plan iteratively: sketch, notice problems, revise, continue. Within a single generation pass, an LLM cannot revise tokens it has already emitted, and at planning time it has not yet seen any execution results. The plan looks impressive but is essentially a confident hallucination of a coherent strategy. This is one reason ReAct-style agents (reason-act-observe loops) tend to do better than plan-then-execute agents on complex tasks where later steps depend on earlier results. The fix is to externalize the iterative loop: plan a little, execute, observe, replan.
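The loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a reference implementation: `run_loop`, `Step`, `Observation`, `toy_planner`, and `make_toy_executor` are hypothetical names invented for this example. In a real agent, `plan_next` would wrap a model call that receives the goal plus the observation history, and `execute` would dispatch to actual tools.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Step:
    action: str    # hypothetical action name, e.g. "fetch"
    argument: str


@dataclass
class Observation:
    step: Step
    output: str
    ok: bool


def run_loop(
    goal: str,
    plan_next: Callable[[str, List[Observation]], List[Step]],
    execute: Callable[[Step], Observation],
    max_iterations: int = 10,
    horizon: int = 2,
) -> List[Observation]:
    """Plan at most `horizon` steps, execute them, observe, then replan."""
    history: List[Observation] = []
    for _ in range(max_iterations):
        # The planner sees everything observed so far, so each new
        # mini-plan can account for real results, not assumed ones.
        steps = plan_next(goal, history)[:horizon]
        if not steps:
            break  # planner reports nothing left to do
        for step in steps:
            observation = execute(step)
            history.append(observation)
            if not observation.ok:
                break  # drop the rest of this mini-plan and replan
    return history


# Toy stand-ins (assumptions for illustration only).
PIPELINE = ["fetch", "parse", "store"]


def toy_planner(goal: str, history: List[Observation]) -> List[Step]:
    if history and not history[-1].ok:
        # Replan around the failure that was just observed.
        failed = history[-1].step
        return [Step("repair", failed.action), Step(failed.action, failed.argument)]
    done = [o.step.action for o in history if o.ok]
    return [Step(a, goal) for a in PIPELINE if a not in done]


def make_toy_executor() -> Callable[[Step], Observation]:
    state = {"repaired": False}

    def execute(step: Step) -> Observation:
        if step.action == "repair":
            state["repaired"] = True
            return Observation(step, "repaired", True)
        if step.action == "parse" and not state["repaired"]:
            return Observation(step, "parse error", False)
        return Observation(step, f"{step.action} ok", True)

    return execute


if __name__ == "__main__":
    for obs in run_loop("demo", toy_planner, make_toy_executor()):
        print(obs.step.action, obs.ok, obs.output)
```

In this toy run the first `parse` fails, the planner sees that failure in the history, and the next mini-plan inserts a `repair` step before retrying. A plan written entirely up front could not have included that step, because the failure was not known when the plan was generated. The `horizon` and `max_iterations` defaults are arbitrary choices for the sketch.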
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:32:45.984170+00:00 - report_created - created