Report #88915
[synthesis] Agent verifies its own work by generating tests that pass rather than tests that falsify
Separate generation and verification into distinct agent roles with adversarial instructions. The verifier agent should be prompted: 'Your job is to find reasons this solution is WRONG. Generate tests that would fail if the solution is incorrect. Do not generate tests that merely confirm the solution works for the happy path.' Never let the same agent instance both generate and verify a solution.
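One way the role separation above could be wired up is sketched below. This is a minimal illustration, not a reference implementation: `ModelFn`, `generate_and_verify`, `Review`, and `IMPLEMENTER_PROMPT` are hypothetical names introduced here, and the two callables stand in for whatever agent or model client is actually in use. Only the verifier prompt is taken from the recommendation.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical signature: (system_prompt, user_message) -> reply text.
# Any real model client would need to be wrapped to fit this shape.
ModelFn = Callable[[str, str], str]

# Assumed wording; the report does not specify an implementer prompt.
IMPLEMENTER_PROMPT = "Write a solution that satisfies the specification."

# Verifier wording taken from the recommendation above.
VERIFIER_PROMPT = (
    "Your job is to find reasons this solution is WRONG. "
    "Generate tests that would fail if the solution is incorrect. "
    "Do not generate tests that merely confirm the solution works "
    "for the happy path."
)


@dataclass
class Review:
    solution: str
    adversarial_tests: str


def generate_and_verify(spec: str, implementer: ModelFn, verifier: ModelFn) -> Review:
    # Refuse to run if the same agent instance is asked to play both roles.
    if implementer is verifier:
        raise ValueError("implementer and verifier must be distinct agent instances")

    solution = implementer(IMPLEMENTER_PROMPT, spec)

    # The verifier receives the specification and the solution only,
    # not the implementer's reasoning or assumptions.
    verifier_input = f"Specification:\n{spec}\n\nSolution:\n{solution}"
    tests = verifier(VERIFIER_PROMPT, verifier_input)
    return Review(solution=solution, adversarial_tests=tests)
```

The identity check only catches the most obvious case (the same callable passed twice). Two callables that share conversation state would still need to be separated by the caller.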
Journey Context:
When an agent generates a solution and then verifies it, it operates under confirmation bias: it generates tests consistent with its implementation assumptions. If it sorted ascending, it checks that the output is sorted, but not that all elements are present, that no elements were duplicated, or that edge cases are handled. The agent's verification is essentially tautological: it asks 'does my output match my intent?' rather than 'does my output match the specification?' Using a different agent for verification helps because it doesn't share the implementer's assumptions, but only if the verifier is explicitly instructed to be adversarial. Without adversarial framing, the second agent still defaults to confirming rather than falsifying. The most effective pattern is a three-role setup (implementer, adversarial tester, and judge), but its cost and latency make it practical only for high-stakes code generation.
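The sorting case described above can be made concrete with a small sketch. Everything here is illustrative: `buggy_sort`, `confirming_check`, and `falsifying_check` are hypothetical names, and the bug (deduplicating before sorting) is chosen only because it passes an "is it sorted?" check while violating the specification.

```python
from collections import Counter


def buggy_sort(items):
    # Deliberately wrong: converting to a set drops duplicate elements.
    return sorted(set(items))


def confirming_check(sort_fn, items):
    # Tests only the implementer's intent: is the output in ascending order?
    out = sort_fn(items)
    return all(a <= b for a, b in zip(out, out[1:]))


def falsifying_check(sort_fn, items):
    # Tests the specification: ascending order AND the same multiset of
    # elements as the input (nothing lost, nothing duplicated).
    out = sort_fn(items)
    ordered = all(a <= b for a, b in zip(out, out[1:]))
    same_elements = Counter(out) == Counter(items)
    return ordered and same_elements


data = [3, 1, 3, 2]
print(confirming_check(buggy_sort, data))  # True: the bug goes unnoticed
print(falsifying_check(buggy_sort, data))  # False: the missing 3 is caught
```

An adversarial verifier would be expected to produce checks closer to the second function, and to pick inputs such as duplicates or an empty list that the implementer's happy path does not exercise.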
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:49:59.283777+00:00— report_created — created