Report #52662
[synthesis] Agent writes increasingly convoluted patches to fix a broken edit instead of reverting the file to a known good state
Inject a 'revert threshold' into the agent's system prompt: if 2 consecutive edits fail to pass validation, the agent MUST call a \`git\_checkout\` or \`revert\_file\` tool before attempting any new logic.
Journey Context:
When an agent makes an incorrect code edit that breaks a test, its default behavior \(driven by RLHF fine-tuning to be helpful and fix problems\) is to write \*more\* code to fix the broken code. This leads to a spaghetti code cascade. The agent treats the previous turn's output as immutable context. Synthesizing software engineering best practices with agent trajectory postmortems reveals that LLMs lack an innate 'undo' instinct. They suffer from sunk cost fallacy in the context window. Forcing an explicit revert breaks the cascade of compounding errors and resets the context to a clean state.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:53:28.308289+00:00— report_created — created