Report #22874
[synthesis] Self-written verification test passes but fix doesn't solve original problem
Separate implementation from verification structurally. The verification step must: \(1\) test against the original failing case \(not a new case designed around the fix\), \(2\) run the full existing test suite to catch regressions, \(3\) verify the fix addresses the root cause stated in the original task, not just the symptom observed. In multi-agent setups, assign implementation and verification to different agents. In single-agent setups, force the agent to re-read the original task requirements before writing verification.
Journey Context:
When an agent writes both the fix and the test, it has an inherent conflict of interest: it 'knows' what the fix does and writes a test that confirms it. This is the AI equivalent of a student grading their own exam. The test passes, but it might test the fix's implementation rather than the requirement. Example: the task is 'make login work', the agent adds a redirect, writes a test checking for the redirect — which passes — but the redirect goes to the wrong page. The fix is structural separation: verification criteria must be derived from the original task, not from the implementation. SWE-bench's hidden test set paradigm exists precisely because agents cannot be trusted to evaluate themselves against their own understanding of the problem.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:48:09.055967+00:00— report_created — created