Report #58210
[synthesis] AI coding agent generates correct code but fails to apply edits to existing files without breaking surrounding code
Treat edit application as a first-class problem, on par with code generation. Use a dedicated apply model or a structured search-and-replace diff format rather than naive string replacement or full-file regeneration. The apply step must: (1) receive the original code plus the proposed change, (2) locate the edit precisely using fuzzy-matched anchor text, and (3) validate that the result still parses. If edit volume justifies it, train or fine-tune a model specifically for the edit-application task.
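The three steps can be sketched in Python as follows. This is a minimal illustration, not the implementation used by any product: the helper names `find_anchor` and `apply_edit`, the 0.8 similarity threshold, and the use of `difflib` for fuzzy matching and `ast.parse` for validation are all assumptions made for the example, and it only handles Python source.

```python
import ast
import difflib


def find_anchor(original_lines, search_lines, threshold=0.8):
    """Return (start, end) of the window most similar to search_lines, or None.

    Hypothetical helper: compares whitespace-stripped lines so that
    indentation drift in the model's output does not defeat the match.
    """
    n = len(search_lines)
    if n == 0 or n > len(original_lines):
        return None
    target = "\n".join(line.strip() for line in search_lines)
    best_score, best_start = 0.0, None
    for start in range(len(original_lines) - n + 1):
        window = "\n".join(line.strip() for line in original_lines[start:start + n])
        score = difflib.SequenceMatcher(None, window, target, autojunk=False).ratio()
        if score > best_score:
            best_score, best_start = score, start
    if best_start is None or best_score < threshold:
        return None
    return best_start, best_start + n


def apply_edit(original, search, replace):
    """Apply one search-and-replace edit; raise instead of returning a broken file."""
    lines = original.splitlines()
    # Step 1: inputs are the original code plus the proposed change.
    # Step 2: locate the edit via a fuzzy-matched anchor.
    span = find_anchor(lines, search.splitlines())
    if span is None:
        raise ValueError("anchor not found; refusing to guess an edit location")
    start, end = span
    result = "\n".join(lines[:start] + replace.splitlines() + lines[end:]) + "\n"
    # Step 3: validate. ast.parse raises SyntaxError if the result no longer parses.
    ast.parse(result)
    return result
```

Calling `apply_edit(source, search, replace)` with a `search` block whose spacing differs slightly from the file still finds the right location, while an unrelated `search` block raises `ValueError` rather than editing the wrong place. The sketch inserts the replacement text with whatever indentation the model produced; a real apply step would likely also need to re-indent it and to handle anchors that match more than one location.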
Journey Context:
The naive approach is full-file regeneration followed by diffing, which is expensive, slow, and fragile (whitespace and formatting drift). The next attempt is asking the LLM to output unified diffs directly, but LLMs are unreliable at counting line numbers and produce off-by-one errors that corrupt files. The breakthrough pattern, visible in both Cursor's dedicated Apply model and Aider's search-and-replace blocks, is to have the model output the code to find (fuzzy-matchable anchor text) together with the replacement code. Cursor went further by training a dedicated model for edit application, treating it as a distinct task from code generation. The insight: code generation and edit application are fundamentally different tasks. Generation is creative; application is precise. Products that treat apply as an afterthought produce broken edits that erode trust; products that solve it create the 'it just works' feeling. The remaining hard case is large refactorings that touch many files, which no product has fully solved yet.
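The difference between line-number edits and anchor-based edits can be shown with a small Python sketch. The function names `apply_by_line_number`, `apply_by_anchor`, and `parses` are hypothetical, and exact string matching stands in for fuzzy matching to keep the example short; it illustrates the failure mode only and does not reproduce any product's edit format.

```python
import ast


def apply_by_line_number(text, line_no, new_line):
    """Replace the 1-indexed line `line_no`, trusting the number blindly."""
    lines = text.splitlines()
    lines[line_no - 1] = new_line
    return "\n".join(lines) + "\n"


def apply_by_anchor(text, search, replace):
    """Replace `search` only if it occurs exactly once; otherwise refuse."""
    if text.count(search) != 1:
        raise ValueError("anchor missing or ambiguous")
    return text.replace(search, replace, 1)


def parses(text):
    """Return True if `text` is syntactically valid Python."""
    try:
        ast.parse(text)
        return True
    except SyntaxError:
        return False


source = (
    "def area(w, h):\n"
    "    return w + h\n"
    "\n"
    "print(area(2, 3))\n"
)

# The fix belongs on line 2, but suppose the model miscounts and says line 1.
# The `def` line is overwritten, so the file no longer parses.
corrupted = apply_by_line_number(source, 1, "    return w * h")

# The anchor-based edit names the text to change, so there is nothing to miscount.
fixed = apply_by_anchor(source, "    return w + h", "    return w * h")
```

Here `parses(corrupted)` is false because the function header has been overwritten, while `parses(fixed)` is true and only the intended line has changed. The anchor-based version also fails loudly when the anchor is missing or ambiguous, which is easier to recover from than a silently misplaced edit.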
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:11:50.496892+00:00— report_created — created