Report #88930
[counterintuitive] Why does the model drift from its own outline or fail to follow a plan when asked to 'plan first, then execute'?
Decompose complex tasks into separate LLM calls: one call to generate a plan, programmatic validation of the plan, then separate calls for each execution step. Do not ask a single LLM call to both plan and execute a complex structured output. If the plan needs revision, revise it in a separate call before execution begins.
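The decomposition above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `call_llm` stands in for whatever client function you use (assumed here to take a prompt string and return the reply text), and the JSON-list plan format, the `max_steps` limit, and the prompt wording are all assumptions made for the example.

```python
import json
from typing import Callable, List

# Hypothetical signature: takes a prompt string, returns the model's reply text.
LLMCall = Callable[[str], str]


def validate_plan(raw: str, max_steps: int = 10) -> List[str]:
    """Check the plan in ordinary code before any execution call is made."""
    steps = json.loads(raw)  # malformed JSON raises a ValueError subclass
    if not isinstance(steps, list) or not steps:
        raise ValueError("plan must be a non-empty JSON list")
    if len(steps) > max_steps:
        raise ValueError("plan has too many steps")
    if not all(isinstance(s, str) and s.strip() for s in steps):
        raise ValueError("every step must be a non-empty string")
    return [s.strip() for s in steps]


def plan_then_execute(task: str, call_llm: LLMCall, max_revisions: int = 2) -> List[str]:
    """One call to plan, code-level validation, then one call per step."""
    prompt = f"Return ONLY a JSON list of short step titles for: {task}"
    for _ in range(max_revisions + 1):
        raw = call_llm(prompt)
        try:
            steps = validate_plan(raw)
            break
        except ValueError as err:
            # Revision happens in its own call, before execution begins.
            prompt = (
                f"Revise this plan. Problem: {err}. Plan: {raw}. "
                "Return ONLY a JSON list of strings."
            )
    else:
        raise RuntimeError("no valid plan after revisions")

    outputs = []
    for i, step in enumerate(steps, start=1):
        # Each step gets the frozen, validated plan as context.
        outputs.append(
            call_llm(f"Task: {task}\nFull plan: {json.dumps(steps)}\nWrite ONLY step {i}: {step}")
        )
    return outputs


def fake_llm(prompt: str) -> str:
    """Stand-in model used only to make the sketch runnable offline."""
    if prompt.startswith("Return ONLY"):
        return '["Intro", "Method", "Results"]'
    return "draft of " + prompt.splitlines()[-1]


if __name__ == "__main__":
    for section in plan_then_execute("a short report", fake_llm):
        print(section)
```

Because the plan is frozen and validated before the first execution call, every step receives the same plan as input, rather than relying on the model to stay faithful to an outline it produced earlier in the same generation.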
Journey Context:
Developers instruct models to 'first create an outline, then write according to it' or 'plan your approach before answering', expecting the model to do genuine planning. LLMs generate tokens autoregressively: each token is produced conditioned on all previous tokens, and within a single generation there is no mechanism to revise tokens already emitted or to simulate future states before committing. When a model 'plans', it is still generating sequentially; it cannot backtrack, and it cannot verify the plan against content that does not exist yet. This is why models drift from their own outlines: the outline was generated without any way to test it against the actual execution. Decoding in a decoder-only transformer is a left-to-right process with no built-in lookahead. True planning requires the ability to simulate outcomes before committing, which a single generation pass does not provide. The workaround is multi-call decomposition: plan in one call, validate programmatically, execute in subsequent calls. Prompt wording alone does not remove this limitation, because it is a direct consequence of causal (left-to-right) attention masking in decoder-only transformers.
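To make the left-to-right constraint concrete, here is a toy sketch, not a real transformer. `causal_mask` builds the lower-triangular visibility pattern that causal attention imposes, and `greedy_decode` shows the append-only loop in which each token is chosen from the prefix alone and never rewritten. The `next_token` callback and the `<eos>` marker are hypothetical stand-ins for a model and its stop token.

```python
from typing import Callable, List


def causal_mask(n: int) -> List[List[bool]]:
    """mask[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]


def greedy_decode(
    next_token: Callable[[List[str]], str],
    prompt: List[str],
    max_new: int,
) -> List[str]:
    """Append-only generation: each token sees only the prefix emitted so far."""
    tokens = list(prompt)
    for _ in range(max_new):
        tok = next_token(tokens)  # conditioned on the current prefix only
        if tok == "<eos>":
            break
        tokens.append(tok)  # committed; earlier tokens are never revised
    return tokens


if __name__ == "__main__":
    for row in causal_mask(4):
        print(["x" if visible else "." for visible in row])
```

In this picture an outline is simply an early stretch of `tokens`: later steps can read it, but nothing in the loop goes back to check or repair it.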
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:51:22.547169+00:00— report_created — created