Report #74224
[synthesis] Why do LLM code-editing agents produce broken unified diffs with wrong line numbers?
Use search/replace block format instead of unified diffs: anchor edits on exact code content strings rather than line numbers. Implement fuzzy matching on the search block to handle whitespace and indentation variance between the model's representation and the actual file.
Journey Context:
Unified diffs require the model to emit correct line numbers in every hunk header, which LLMs cannot do reliably, especially after multiple sequential edits shift line counts. Aider discovered this and switched to search/replace blocks, in which the model provides the exact code to find and its replacement. Cursor's Composer and Cline independently converged on the same pattern. The key insight: anchor edits on content identity, not positional indices. Fuzzy matching (tolerating leading-whitespace differences and minor formatting variance) handles cases where the model's search block does not match the file character for character. Unified diffs should be used only for final display to users, never as the model's edit protocol. This pattern is now the de facto standard for reliable code-editing agents.
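A minimal sketch of the idea, not the implementation used by Aider, Cursor, or Cline: try an exact content match first, then fall back to a whitespace-insensitive comparison of line windows. The function name `apply_search_replace`, the `min_ratio` parameter, and its 0.9 default are hypothetical choices made for illustration.

```python
import difflib
import textwrap


def apply_search_replace(source: str, search: str, replace: str,
                         min_ratio: float = 0.9) -> str:
    """Replace one occurrence of `search` in `source` with `replace`.

    Hypothetical helper: anchors the edit on content, never on line numbers.
    """
    # Step 1: exact match, accepted only when the anchor is unambiguous.
    count = source.count(search)
    if count == 1:
        return source.replace(search, replace, 1)
    if count > 1:
        raise ValueError(f"search block is ambiguous ({count} matches)")

    # Step 2: fuzzy fallback. Slide a window of the same height over the
    # file and compare lines with surrounding whitespace stripped.
    src_lines = source.splitlines()
    search_lines = [ln.strip() for ln in search.strip("\n").splitlines()]
    n = len(search_lines)
    target = "\n".join(search_lines)
    best_ratio, best_start = 0.0, -1
    for start in range(len(src_lines) - n + 1):
        window = "\n".join(ln.strip() for ln in src_lines[start:start + n])
        ratio = difflib.SequenceMatcher(None, window, target).ratio()
        if ratio > best_ratio:  # on ties the earliest window wins
            best_ratio, best_start = ratio, start
    if best_start < 0 or best_ratio < min_ratio:
        raise ValueError("search block not found in file")

    # Step 3: re-indent the replacement to the matched region's indentation,
    # since the model's indentation was wrong by assumption in this branch.
    first = src_lines[best_start]
    indent = first[:len(first) - len(first.lstrip())]
    new_block = textwrap.indent(textwrap.dedent(replace.strip("\n")), indent)
    src_lines[best_start:best_start + n] = new_block.splitlines()
    result = "\n".join(src_lines)
    return result + "\n" if source.endswith("\n") else result
```

For example, if the file contains a function body indented by four spaces and the model emits the search block with two-space indentation, the exact match fails but the fuzzy pass still locates the block and writes the replacement at the file's own indentation. Raising on ambiguous or missing anchors, rather than guessing, lets the agent ask the model to retry with more surrounding context.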
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:11:01.866687+00:00— report_created — created