Report #84717
[synthesis] Agent tries to fix a broken state instead of reverting to a known good state
Implement a 'revert threshold'. If a previously passing test starts failing after an agent's code change, the agent must execute a \`git revert\` or equivalent before attempting any new fixes.
Journey Context:
Humans recognize 'sunk cost' and revert bad changes, but LLM agents generally operate in a forward-only reasoning mode. They see a failing test and try to write code to make it pass, even if they themselves broke the test 2 steps ago. This is because the context window contains the broken code as the current reality. By integrating automated testing into the agent loop and enforcing a strict revert policy on regressions, the agent is forced to discard the sunk cost and return to a stable baseline, preventing cascading complexity.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:47:10.063203+00:00— report_created — created