Report #100364
[counterintuitive] Autoregressive LLMs plan ahead using an internal world model
Do not trust an autoregressive model to look ahead, verify constraints, or maintain a consistent world state over long horizons on its own. For multi-step decisions, wrap the model in explicit planning, tool use, and verification loops: generate a plan, execute it step by step with state checks, and validate outputs against a domain model or simulator.
Journey Context:
Autoregressive LLMs are trained to maximize next-token likelihood, not task-level success. They are prone to exposure bias, myopic reasoning, and locally plausible text that violates global constraints. While hidden activations can encode some state-like structure (e.g., Othello boards), the generation process itself does not consistently simulate future states or search over them the way a classical planner does. LeCun's JEPA argument and follow-up surveys emphasize that token prediction is not equivalent to world-model-based planning. For agents, the reliable pattern is to have the LLM propose goals or plans while a deterministic controller, runtime monitor, or simulator enforces constraints and state consistency.
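The propose-then-verify pattern described above can be sketched in Python. This is a minimal illustration under stated assumptions, not a reference implementation: the 1-D track domain, the `State`, `simulate`, `check_plan`, and `run_verified` names, and the `stub_proposer` that stands in for an LLM call are all hypothetical. The point is only that the deterministic domain model, not the proposer, decides whether a plan is accepted.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical toy domain: an agent on cells 0..4 that must never enter a blocked cell.
TRACK = range(0, 5)
BLOCKED = {2}
MOVES = {"left": -1, "right": 1, "jump_right": 2}


@dataclass(frozen=True)
class State:
    pos: int


def simulate(state: State, action: str) -> State:
    """Deterministic domain model: return the next state or raise ValueError."""
    if action not in MOVES:
        raise ValueError(f"unknown action {action!r}")
    nxt = state.pos + MOVES[action]
    if nxt not in TRACK:
        raise ValueError(f"cell {nxt} is off the track")
    if nxt in BLOCKED:
        raise ValueError(f"cell {nxt} is blocked")
    return State(nxt)


def check_plan(plan: List[str], start: State, goal: int) -> str:
    """Replay the plan step by step; return '' if valid, else the first violation."""
    state = start
    for i, action in enumerate(plan):
        try:
            state = simulate(state, action)
        except ValueError as err:
            return f"step {i} ({action}) rejected: {err}"
    if state.pos != goal:
        return f"plan ends at {state.pos}, goal is {goal}"
    return ""


# A proposer takes (start, goal, feedback) and returns a list of action names.
Proposer = Callable[[State, int, str], List[str]]


def run_verified(propose: Proposer, start: State, goal: int,
                 max_attempts: int = 3) -> List[str]:
    """Ask the proposer for plans until one passes the domain-model check."""
    feedback = ""
    for _ in range(max_attempts):
        plan = propose(start, goal, feedback)
        feedback = check_plan(plan, start, goal)
        if not feedback:
            return plan
    raise RuntimeError(f"no valid plan after {max_attempts} attempts: {feedback}")


def stub_proposer(start: State, goal: int, feedback: str) -> List[str]:
    """Stand-in for an LLM call (assumption): the first guess is locally
    plausible but walks into the blocked cell; it revises once given feedback."""
    if not feedback:
        return ["right"] * (goal - start.pos)
    return ["right", "jump_right", "right"]


if __name__ == "__main__":
    print(run_verified(stub_proposer, State(0), goal=4))
```

In this sketch the first proposed plan is rejected at step 1 because it enters the blocked cell, the violation is passed back as feedback, and the revised plan is accepted only after the simulator replays it successfully. In a real agent the proposer would be a model call and `simulate` would be replaced by whatever domain model, runtime monitor, or simulator the task provides.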
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-01T05:06:13.603982+00:00— report_created — created