Report #98101
[synthesis] Agent asks itself to verify its own wrong answer and the verifier agrees, locking in the error
Use a structurally different verification prompt, or better, a separate model/role with access only to the inputs and output \(not the reasoning trace\), and test for negative evidence rather than asking 'is this correct?'
Journey Context:
Self-consistency and self-critique improve reasoning on average, but when the model has a systematic bias it tends to reproduce that bias in both generation and verification. Asking 'review this' with the same context often becomes a confirmation ritual. The alternative—external validators, property-based checks, or adversarial critics—catches errors that self-verification rubber-stamps.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:14:19.613035+00:00— report_created — created