Report #97581
[counterintuitive] LLM fails to correctly execute a multi-step plan even with detailed instructions
Externalize state. Use a state machine, planner, or symbolic executor to track progress; let the LLM generate candidate actions or translate goals, not manage state internally.
Journey Context:
It's tempting to hand an LLM a long plan and ask it to execute the plan step by step. But LLMs are poor at tracking evolving state across many steps. Errors compound because each step has a non-zero failure probability and the model cannot reliably update an internal world model: if steps failed independently at 5% each, a 20-step plan would complete cleanly only about 36% of the time (0.95^20 ≈ 0.36). On procedurally generated reasoning tasks, performance collapses as state size grows. Planning and state tracking fall outside the LLM's core competence, so the fix is architectural rather than prompt-level. The robust pattern is an LLM-in-the-loop controller that holds explicit state.
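The controller pattern can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the TRANSITIONS table, the Controller class, the run loop, and make_flaky_proposer (a stand-in for a real model call) are all hypothetical names invented for this example. The point it shows is that the controller owns the state and validates every proposal, while the "LLM" only suggests the next action.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical three-step plan. The controller, not the model, owns this table.
TRANSITIONS: Dict[str, Dict[str, str]] = {
    "start": {"fetch": "fetched"},
    "fetched": {"validate": "validated"},
    "validated": {"publish": "done"},
}


@dataclass
class Controller:
    """Holds the explicit state and accepts only legal transitions."""
    state: str = "start"
    history: List[str] = field(default_factory=list)
    rejected: List[str] = field(default_factory=list)

    def legal_actions(self) -> List[str]:
        return sorted(TRANSITIONS.get(self.state, {}))

    def apply(self, action: str) -> bool:
        next_state = TRANSITIONS.get(self.state, {}).get(action)
        if next_state is None:
            # Illegal proposal: record it and leave the state untouched.
            self.rejected.append(action)
            return False
        self.history.append(action)
        self.state = next_state
        return True


Proposer = Callable[[str, List[str]], str]


def run(propose: Proposer, max_attempts: int = 20) -> Controller:
    """Ask the proposer for one action at a time until done or out of attempts."""
    ctl = Controller()
    attempts = 0
    while ctl.state != "done" and attempts < max_attempts:
        attempts += 1
        ctl.apply(propose(ctl.state, ctl.legal_actions()))
    return ctl


def make_flaky_proposer() -> Proposer:
    """Stand-in for an LLM call (assumption): its first proposal is out of order."""
    calls = {"n": 0}

    def propose(state: str, legal: List[str]) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            return "publish"  # skips ahead; the controller should reject it
        return legal[0]

    return propose


if __name__ == "__main__":
    result = run(make_flaky_proposer())
    print(result.state, result.history, result.rejected)
```

Because the proposer is passed the current state and the legal actions on every call, it never has to remember where it is in the plan. A bad proposal costs one retry instead of silently corrupting later steps, and max_attempts bounds the loop if the proposer never recovers.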
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:21:57.969652+00:00— report_created — created