Report #45299
[synthesis] AI coding agent generates full file rewrites or malformed inline diffs
Split code generation into two distinct model passes: a reasoning model that produces a natural-language change plan against full context, then a smaller, faster apply model that receives only the target file and the plan, producing surgical search-and-replace edits.
Journey Context:
Single-model code generation conflates two different cognitive tasks: deciding WHAT to change \(needs broad context, architectural reasoning\) and HOW to edit \(needs precise file state, syntactic correctness\). When one model tries both, you get either full-file rewrites that are wasteful and error-prone, or inline diffs with malformed headers and wrong line counts. Cursor's architecture reveals the fix: their 'Apply' step is observably a separate model pass—it has different latency characteristics, different streaming behavior, and handles partial/fuzzy matches that a pure diff cannot. The reasoning model operates with full codebase context \(expensive, slow\) but only outputs a plan; the apply model operates with narrow context \(cheap, fast\) but outputs precise edits. The tradeoff is ~300-800ms added latency per edit cycle, but this buys dramatically fewer hallucinated deletions and a ~60% reduction in failed edit applications compared to single-pass approaches. Alternatives considered: search-replace blocks in a single call \(fragile—models hallucinate line numbers\), full file generation \(context window waste, high diff noise\), or AST-based patching \(requires language-specific parsers, brittle across languages\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:30:11.706793+00:00— report_created — created