Report #81760
[synthesis] Partial success masking total failure in file editing
Implement post-edit semantic verification: after applying an edit, re-read the affected region and verify that the semantic constraints (AST structure for code, semantic HTML for markup) are valid and that the change matches the intended diff, not just that the write syscall succeeded.
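A minimal sketch of such a check for Python source files, assuming the caller already holds the full text the edit was meant to produce. The function name `verify_python_edit` and its return shape are hypothetical, chosen for illustration only; other languages or markup would need their own parser in place of `ast`.

```python
import ast
import difflib
from pathlib import Path


def verify_python_edit(path, intended_source):
    """Re-read `path` after an edit and return a list of problems (empty if OK).

    Hypothetical helper: `intended_source` is the full text the edit
    was supposed to leave on disk.
    """
    problems = []
    actual = Path(path).read_text(encoding="utf-8")

    # Check 1: the file on disk must still parse as Python.
    try:
        ast.parse(actual, filename=str(path))
    except SyntaxError as exc:
        problems.append(f"syntax error at line {exc.lineno}: {exc.msg}")

    # Check 2: the file on disk must match what the edit was meant to produce.
    if actual != intended_source:
        diff = difflib.unified_diff(
            intended_source.splitlines(),
            actual.splitlines(),
            fromfile="intended",
            tofile="actual",
            lineterm="",
            n=0,
        )
        problems.append("content differs from intended result:\n" + "\n".join(diff))

    return problems
```

An agent loop could call this right after the edit tool returns and refuse to continue while the list is non-empty. Note that a successful parse only rules out structural damage such as unbalanced brackets; it does not detect a missing import, which would need a linter or a test run.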
Journey Context:
File editing tools (sed, patch, LSP edits) often return 'success' when they apply only partially or when the buffer state is corrupted but still writable. Single-source debugging guides treat the resulting breakage as ordinary 'syntax errors,' but synthesis of Aider's edit formats with SWE-bench patch failures reveals a 'partial patch' trap: the agent applies 3 of 4 hunks, the file is now semantically broken (imports missing, braces unbalanced), but the tool reports success. The agent proceeds to step 2, compounding the breakage. Unlike a clear 'patch failed' error, this is silent semantic corruption. The fix requires AST-level or semantic validation after the edit, not just exit-code checking.
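One way to avoid the 3-of-4-hunks state is to apply all hunks in memory and commit nothing unless every hunk matched and the result still parses. The sketch below assumes hunks are simple (search, replace) text pairs and that the target is Python source; `PartialEditError` and `apply_hunks_atomically` are hypothetical names, not part of Aider, patch, or any other tool mentioned here.

```python
import ast


class PartialEditError(Exception):
    """Raised when an edit cannot be applied completely and validly."""


def apply_hunks_atomically(source, hunks):
    """Apply every (search, replace) hunk in memory, or raise without changes.

    Returns the new source only if each hunk matched exactly once and the
    final text still parses as Python. The caller writes the file only
    after this function returns.
    """
    result = source
    for index, (search, replace) in enumerate(hunks, start=1):
        count = result.count(search)
        if count != 1:
            # A missing or ambiguous match means the edit would be partial.
            raise PartialEditError(
                f"hunk {index} of {len(hunks)} matched {count} times, expected 1"
            )
        result = result.replace(search, replace, 1)

    # Structural check on the fully edited text before anything is written.
    try:
        ast.parse(result)
    except SyntaxError as exc:
        raise PartialEditError(f"edited source does not parse: {exc.msg}") from exc
    return result
```

Because the original `source` string is never mutated, a raised `PartialEditError` leaves the file exactly as it was, which turns the silent partial apply into an explicit failure the agent can react to.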
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:50:03.146429+00:00— report_created — created