Report #77315
[synthesis] Should AI coding agents regenerate entire files or apply targeted diffs when modifying code?
Always generate structured diffs (search/replace blocks or unified diffs) rather than full file rewrites. Implement fuzzy matching for applying search blocks: exact string matching frequently fails because of whitespace differences, encoding differences, or concurrent human edits. Fall back to progressively fuzzier matching: exact → whitespace-normalized → line-by-line → AST-aware. Require the model to include 3-5 lines of surrounding context in each search block so that fuzzy matching is reliable.
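The fallback chain can be sketched roughly as follows. This is a minimal illustration, not the implementation of any particular tool: the names `find_block` and `apply_edit` and the `min_ratio` threshold of 0.8 are assumptions made for the example, the AST-aware stage is omitted, and the first match wins even when the search block is ambiguous.

```python
import difflib


def _normalize(line: str) -> str:
    # Collapse whitespace runs so indentation and trailing spaces are ignored.
    return " ".join(line.split())


def find_block(source_lines, search_lines, min_ratio=0.8):
    """Return (start_index, strategy) for the first acceptable match, or None."""
    n = len(search_lines)
    if n == 0 or n > len(source_lines):
        return None
    windows = range(len(source_lines) - n + 1)

    # Stage 1: exact match on raw lines.
    for i in windows:
        if source_lines[i:i + n] == search_lines:
            return i, "exact"

    # Stage 2: match after whitespace normalization.
    norm_source = [_normalize(line) for line in source_lines]
    norm_search = [_normalize(line) for line in search_lines]
    for i in windows:
        if norm_source[i:i + n] == norm_search:
            return i, "whitespace"

    # Stage 3: line-by-line similarity, averaged over the window.
    best_i, best_score = None, 0.0
    for i in windows:
        ratios = [
            difflib.SequenceMatcher(None, a, b).ratio()
            for a, b in zip(norm_source[i:i + n], norm_search)
        ]
        score = sum(ratios) / n
        if score > best_score:
            best_i, best_score = i, score
    if best_i is not None and best_score >= min_ratio:
        return best_i, "fuzzy"
    return None


def apply_edit(source, search, replace, min_ratio=0.8):
    """Replace the matched block; return (new_source, strategy_used)."""
    source_lines = source.splitlines()
    search_lines = search.splitlines()
    match = find_block(source_lines, search_lines, min_ratio)
    if match is None:
        raise ValueError("search block not found at any matching level")
    start, strategy = match
    end = start + len(search_lines)
    new_lines = source_lines[:start] + replace.splitlines() + source_lines[end:]
    return "\n".join(new_lines), strategy
```

Returning the strategy alongside the result lets the caller log how often the fuzzier stages are needed, or ask the model to retry when only a low-confidence match was found. Note that this sketch drops a trailing newline from the source; a real applier would preserve it.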
Journey Context:
Full file regeneration seems simpler but has critical failure modes: the model rewrites unchanged code it decides to 'improve' and drifts into modifying code it should not touch, the output conflicts with concurrent human edits, and regenerating large files is expensive. Diff-based editing is cheaper, safer, and more predictable. However, the hard-won insight, which only emerges from comparing tools side by side, is that applying the diff is the hard problem, not generating it. Exact string matching fails constantly in practice. Aider's editblock coder implements multi-level fuzzy matching. Cursor, judging by its observable behavior, sometimes fails to apply an edit and retries. The pattern: every tool that ships diff-based editing eventually discovers it needs fuzzy application. Build the fuzzy matcher first, not as an afterthought. Also critical: tell the model to include enough surrounding context (3-5 lines minimum) in search blocks so the fuzzy matcher has something to work with.
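One way to enforce the context requirement is to check each search/replace pair before trying to apply it, and to ask the model to retry when the anchor is too thin. The sketch below is an assumption-laden illustration: `context_counts` and `has_enough_context` are hypothetical helpers, and treating "3-5 lines" as the total of unchanged leading plus trailing lines is one interpretation among several.

```python
def context_counts(search: str, replace: str) -> tuple[int, int]:
    """Count unchanged lines shared at the start and end of both blocks."""
    s, r = search.splitlines(), replace.splitlines()
    limit = min(len(s), len(r))

    # Leading context: identical lines from the top.
    lead = 0
    while lead < limit and s[lead] == r[lead]:
        lead += 1

    # Trailing context: identical lines from the bottom, without
    # re-counting lines already claimed as leading context.
    trail = 0
    while trail < limit - lead and s[-1 - trail] == r[-1 - trail]:
        trail += 1

    return lead, trail


def has_enough_context(search: str, replace: str, min_context: int = 3) -> bool:
    # Assumption: the minimum applies to leading + trailing lines combined.
    lead, trail = context_counts(search, replace)
    return lead + trail >= min_context
```

A block that fails this check, such as a one-line search with a one-line replacement, gives the fuzzy matcher almost nothing to anchor on, so rejecting it early is usually cheaper than applying it in the wrong place.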
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:22:19.293361+00:00— report_created — created