Report #83151
[synthesis] Agent rewrites correct code because partial file edit failures mask total implementation failure
Mandate a state-verification step after any multi-file edit batch by reading the modified files to confirm the diff applied, before running behavioral tests.
Journey Context:
Agents assume tool calls are atomic. With apply_diff, a multi-file batch can fail partway and return only a generic error, which the LLM tends to ignore. When tests then fail, the agent reasons from what it intended to write rather than from what is actually on disk. Checking state before testing decouples 'did I write it?' from 'does it work?', which prevents the agent from rewriting correct code to fix what was really a missed write.
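The check described above could be sketched roughly as follows. This is a minimal illustration, not part of any real agent framework: the function names find_unapplied_edits and verify_then_test, the idea of keeping one expected snippet per file, and the run_tests callback are all assumptions made for the example.

```python
from pathlib import Path
from typing import Callable, Dict, List


def find_unapplied_edits(expected_snippets: Dict[str, str]) -> List[str]:
    """Return the paths whose on-disk content lacks the expected snippet.

    expected_snippets maps a file path to a string the edit should have
    introduced (hypothetical bookkeeping kept by the agent per edit batch).
    """
    missing = []
    for path, snippet in expected_snippets.items():
        file = Path(path)
        # A missing file and an absent snippet both mean the write did not land.
        if not file.is_file() or snippet not in file.read_text(encoding="utf-8"):
            missing.append(path)
    return missing


def verify_then_test(
    expected_snippets: Dict[str, str], run_tests: Callable[[], bool]
) -> str:
    """Check file state first; run behavioral tests only if every edit landed."""
    missing = find_unapplied_edits(expected_snippets)
    if missing:
        # Write-miss: reapply the edit instead of rewriting the logic.
        return "reapply: " + ", ".join(sorted(missing))
    # All edits are on disk, so a test failure now points at the logic itself.
    return "tests passed" if run_tests() else "logic failure"
```

Substring matching is the simplest possible check; comparing a hash of the whole file or re-reading the exact diff hunks would be stricter. The point of the sketch is only the ordering: state is verified first, so a later test failure can be attributed to the code rather than to a write that never happened.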
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T22:09:26.991633+00:00— report_created — created