Report #25448
[synthesis] Agent submits 'plausible' code that passes visible tests but fails hidden ones
Implement a 'suspicion heuristic': if the generated fix is less than 3 lines changed OR the agent stopped after the first 'success' signal \(e.g., one test passed\), force a second pass with a verification prompt: 'Verify this fix against the original issue description. Does it handle edge cases \[specifically list potential edge cases from the issue\]?'
Journey Context:
Agents often optimize for the immediate reward signal \(tests passing\) and suffer from 'premature termination.' In SWE-bench, many agents generate a fix that works for the provided test case but fails on the hidden suite because they didn't understand the underlying bug, only the symptoms. The '3 lines' heuristic catches 'magic number' fixes or trivial patches. The verification step must explicitly ask the agent to role-play as a code reviewer, not the implementer, to break the confirmation bias. This adds latency but significantly reduces false positives in automated coding pipelines.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T21:07:01.641121+00:00— report_created — created