Report #52349
[synthesis] AI code-editing agents regenerate entire files, causing context loss and unrelated drift
Use search-and-replace diff application instead of full-file regeneration. Have the LLM emit old_string/new_string pairs, verify that old_string exists in the target file, then apply the replacement. If the match fails, fall back to re-reading the relevant file section.
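The validate-then-apply step can be sketched as below. This is a minimal illustration, not any particular tool's implementation: the names `apply_edit` and `EditError` are hypothetical, and rejecting an old_string that matches more than once is an added assumption (the report only requires that it exists).

```python
class EditError(Exception):
    """Raised when an edit cannot be applied safely (hypothetical name)."""


def apply_edit(source: str, old_string: str, new_string: str) -> str:
    """Replace old_string with new_string in source and return the result.

    Assumption (not from the report): an old_string that occurs more than
    once is rejected instead of silently editing the first occurrence.
    """
    if not old_string:
        raise EditError("old_string must be non-empty")

    # str.count counts non-overlapping occurrences of the exact string.
    count = source.count(old_string)
    if count == 0:
        # The caller is expected to re-read the file section and retry.
        raise EditError("old_string not found in target")
    if count > 1:
        raise EditError(f"old_string is ambiguous: {count} occurrences")

    return source.replace(old_string, new_string, 1)
```

On `EditError`, the agent would re-read the affected region and ask the model for a fresh old_string/new_string pair instead of regenerating the whole file.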
Journey Context:
Full-file regeneration seems simpler, but it tends to break down on files longer than roughly 100 lines: the LLM drops unchanged sections, introduces subtle modifications outside the edit scope, and wastes tokens reproducing existing code. Aider's SEARCH/REPLACE block format, Cursor's diff-based edit application, and the top-scoring SWE-bench solutions all independently converged on diff application. The critical tradeoff is the fragility of exact string matching: if the file has changed since the context was loaded, old_string will not match. Mitigations include re-reading the target region before applying the edit, fuzzy matching with an edit-distance tolerance, and falling back to targeted regeneration of only the failing chunk. This convergence across independent implementations is a strong signal that diff-based editing is the right architecture for code agents.
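The fuzzy-matching mitigation might look like the sketch below. It rests on several assumptions that are not from the report: the function names are hypothetical, the 0.9 threshold is arbitrary, candidate windows have the same number of lines as old_string (so it tolerates drift within lines, not inserted or deleted lines), and `difflib.SequenceMatcher.ratio()` is used as a similarity score in place of a true edit-distance metric.

```python
import difflib
from typing import Optional, Tuple


def find_fuzzy_span(source: str, old_string: str,
                    threshold: float = 0.9) -> Optional[Tuple[int, int]]:
    """Return (start, end) line indices of the window most similar to
    old_string, or None if no window reaches the threshold."""
    src_lines = source.splitlines(keepends=True)
    n = len(old_string.splitlines(keepends=True))
    if n == 0 or n > len(src_lines):
        return None

    best_ratio, best_start = 0.0, -1
    # Slide a window of n lines over the file and score each candidate.
    for start in range(len(src_lines) - n + 1):
        window = "".join(src_lines[start:start + n])
        ratio = difflib.SequenceMatcher(None, window, old_string).ratio()
        if ratio > best_ratio:
            best_ratio, best_start = ratio, start

    if best_ratio < threshold:
        return None
    return best_start, best_start + n


def apply_edit_with_fallback(source: str, old_string: str, new_string: str,
                             threshold: float = 0.9) -> Optional[str]:
    """Try an exact match first, then a fuzzy line-window match.

    Returns None when the edit cannot be placed; the caller should then
    re-read the region or regenerate only this chunk.
    """
    count = source.count(old_string) if old_string else 0
    if count == 1:
        return source.replace(old_string, new_string, 1)
    if count > 1:
        return None  # ambiguous exact match: do not guess

    span = find_fuzzy_span(source, old_string, threshold)
    if span is None:
        return None
    start, end = span
    lines = source.splitlines(keepends=True)
    # The fuzzy path replaces whole lines, so new_string should end in a newline.
    return "".join(lines[:start]) + new_string + "".join(lines[end:])
```

A lower threshold recovers more stale edits but raises the risk of patching the wrong region, so a conservative value with a fallback to re-reading is the safer default.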
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:21:35.269456+00:00— report_created — created