Report #51222
[synthesis] Single LLM call tries to both plan a multi-step solution and execute it, producing poor plans or poor execution or both
Split the agent loop into two distinct phases with separate model calls: a Planner that decomposes the task into a structured plan (ordered steps, dependencies, file targets), and an Executor that carries out each step independently. The Planner runs once (or re-plans on failure); the Executor runs per step. Use different model configurations for each: the Planner needs a reasoning-capable model with a large context window; the Executor can use a smaller, faster model scoped to a single step.
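A minimal sketch of the split described above, assuming the two model calls are passed in as plain callables. Every name here (`Step`, `ModelConfig`, `run_agent`, the model names and context sizes) is a hypothetical placeholder for illustration, not any product's API. It also assumes a re-plan reuses the same step ids, so already-completed steps can be skipped.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    id: str
    instruction: str
    depends_on: List[str] = field(default_factory=list)  # ids of prerequisite steps
    files: List[str] = field(default_factory=list)       # file targets for this step


@dataclass
class ModelConfig:
    model: str               # placeholder model name, not a real identifier
    max_context_tokens: int


# Hypothetical configurations: a large reasoning model plans, a small fast one executes.
PLANNER_CONFIG = ModelConfig("large-reasoning-model", 200_000)
EXECUTOR_CONFIG = ModelConfig("small-fast-model", 16_000)

Planner = Callable[[str, ModelConfig], List[Step]]
Executor = Callable[[Step, ModelConfig], bool]  # returns True on success


def order_steps(steps: List[Step]) -> List[Step]:
    """Order steps so that every step comes after the steps it depends on."""
    done, ordered, pending = set(), [], list(steps)
    while pending:
        ready = [s for s in pending if all(d in done for d in s.depends_on)]
        if not ready:
            raise ValueError("plan has cyclic or missing dependencies")
        ordered.extend(ready)
        done.update(s.id for s in ready)
        pending = [s for s in pending if s.id not in done]
    return ordered


def run_agent(task: str, plan: Planner, execute: Executor,
              max_replans: int = 1) -> List[str]:
    """Plan once, execute step by step, and re-plan if a step fails."""
    completed: List[str] = []
    for _ in range(max_replans + 1):
        steps = order_steps(plan(task, PLANNER_CONFIG))      # one Planner call
        failed = False
        for step in steps:
            if step.id in completed:
                continue                                     # finished before a re-plan
            if execute(step, EXECUTOR_CONFIG):               # one Executor call per step
                completed.append(step.id)
            else:
                failed = True
                break
        if not failed:
            return completed
    raise RuntimeError("step still failing after re-planning")
```

The Executor receives only a single `Step`, so whatever context it builds can be limited to that step's `files`; only the Planner call needs the whole task.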
Journey Context:
The single-model approach conflates two different cognitive tasks: decomposition (broad, requires understanding the whole problem) and execution (narrow, requires precision on one piece). Windsurf's Cascade explicitly separates 'understanding/planning' from 'tool execution' in its visible flow. Cursor's agent mode shows a planning phase where it lists steps before acting. Replit Agent displays a task breakdown before code generation. The synthesis: the plan/execute split appears independently in all three products, which suggests it solves a real problem rather than reflecting one vendor's design preference. The Planner can see the full codebase context to make good decomposition decisions; the Executor only needs the context for its specific step, which reduces hallucination from information overload. There are two tradeoffs: re-planning on failure adds latency, and the plan can become stale if the codebase changes mid-execution. The mitigation for staleness: after each Executor step, run a lightweight 'plan still valid?' check before proceeding.
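The report does not say how the 'plan still valid?' check works. One cheap way to do it, sketched here as an assumption, is to fingerprint the files that the remaining steps target and compare them before each step. The function names `fingerprint` and `plan_still_valid` are hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, Optional


def fingerprint(paths: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each path to the SHA-256 of its contents, or None if the file is absent."""
    result: Dict[str, Optional[str]] = {}
    for p in paths:
        path = Path(p)
        result[p] = hashlib.sha256(path.read_bytes()).hexdigest() if path.is_file() else None
    return result


def plan_still_valid(snapshot: Dict[str, Optional[str]],
                     remaining_files: Iterable[str]) -> bool:
    """True if every file targeted by the remaining steps matches the snapshot.

    A path that was never snapshotted counts as a mismatch, so the caller re-plans.
    """
    current = fingerprint(remaining_files)
    return all(p in snapshot and snapshot[p] == h for p, h in current.items())
```

In a loop, the snapshot would be refreshed right after each Executor step finishes, so the step's own writes are not mistaken for outside changes. A `False` result before the next step then means something else (the user, another tool) touched a target file, and the Planner should run again. This check only detects file changes; it does not judge whether the plan's reasoning still holds.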
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:27:52.535496+00:00— report_created — created