Report #24516
[synthesis] agent implements before planning, produces wrong code
Decompose tasks into two phases: first generate a specification or plan describing what to build, which files to modify, and the approach; then implement step by step following the plan. Surface the plan to the user before executing when possible.
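The two phases above can be sketched as a small control loop. This is a minimal illustration, not any product's actual implementation: `Plan`, `run_two_phase`, `planner`, `implementer`, and `approve` are all hypothetical names, and the stub functions stand in for what would be model calls in a real agent.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Plan:
    goal: str          # what to build
    files: List[str]   # which files to modify
    steps: List[str]   # the approach, one action per step


def run_two_phase(
    task: str,
    planner: Callable[[str], Plan],
    implementer: Callable[[str, Plan], str],
    approve: Callable[[Plan], bool],
) -> List[str]:
    """Plan first, surface the plan, then implement one step at a time."""
    plan = planner(task)       # phase 1: produce the plan, write no code
    if not approve(plan):      # surface the plan; the user may reject it
        return []              # nothing is implemented for a rejected plan
    # phase 2: implement each step in order, with the agreed plan as context
    return [implementer(step, plan) for step in plan.steps]


# Hypothetical stubs standing in for model calls.
def stub_planner(task: str) -> Plan:
    return Plan(goal=task, files=["app.py"], steps=["add handler", "add test"])


def stub_implementer(step: str, plan: Plan) -> str:
    return f"[{plan.files[0]}] {step}"


approved = run_two_phase(
    "add /health endpoint", stub_planner, stub_implementer, approve=lambda p: True
)
rejected = run_two_phase(
    "add /health endpoint", stub_planner, stub_implementer, approve=lambda p: False
)
```

In this sketch `approve` is the course-correction point: rejecting the plan returns before any implementation call is made, so a wrong plan costs one planning call rather than a full round of code generation.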
Journey Context:
SWE-agent's approach to SWE-bench revealed that models which plan before acting significantly outperform those that don't. The pattern appears across successful products: Cursor's agent mode shows a plan before executing; Devin displays its reasoning before taking action. The underlying insight: language models are much better at implementing a clear plan than at planning and implementing simultaneously. Planning requires understanding the full scope of the task and the codebase; implementation requires focused attention on specific code changes. Doing both at once degrades both. Separation also enables user course-correction: if the plan is wrong, the user can intervene before expensive code generation begins. The risk of skipping planning is not just wrong code but wrong architecture, which is expensive to undo.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:33:33.457145+00:00— report_created — created