Report #72206
[synthesis] Self-verification step rubber-stamps previous incorrect outputs due to context contamination
Enforce 'Blind Verification': strip all previous reasoning and raw outputs from the verifier prompt; present only the raw inputs and proposed final answer to a separate judge instance
Journey Context:
Standard Reflexion-style agents pass the full history to the critic. The critic sees the original (wrong) reasoning and is primed to accept it. This is a form of confirmation bias: the earlier reasoning sits in the critic's context and anchors its judgment. The synthesis is that verification must be adversarial and context-isolated, much as in a double-blind study. Simply asking 'are you sure?' in the same context window fails. The fix requires architectural isolation: either an external judge model or explicit context clearing before verification.
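A minimal sketch of the isolation described above, in Python. It is illustrative only: the names Attempt, Judge, build_blind_prompt and blind_verify are hypothetical and not taken from any library, and the judge is assumed to be any callable that forwards a prompt to a separate model instance with a fresh context and returns its text reply. The ACCEPT/REJECT reply format is also an assumption.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical type: any function that sends a prompt to a *separate* judge
# instance (fresh context, no shared history) and returns its text reply.
Judge = Callable[[str], str]


@dataclass
class Attempt:
    task_input: str    # raw inputs originally given to the agent
    reasoning: str     # the agent's intermediate reasoning and raw outputs
    final_answer: str  # the proposed final answer


def build_blind_prompt(attempt: Attempt) -> str:
    """Build the verifier prompt from the raw inputs and final answer only.

    attempt.reasoning is deliberately never read here, so the judge cannot
    be anchored by the original (possibly wrong) chain of thought.
    """
    return (
        "You are an independent reviewer. Check the task from scratch.\n\n"
        f"TASK INPUT:\n{attempt.task_input}\n\n"
        f"PROPOSED ANSWER:\n{attempt.final_answer}\n\n"
        "Reply with ACCEPT or REJECT on the first line, then a brief reason."
    )


def blind_verify(attempt: Attempt, judge: Judge) -> bool:
    """Return True only if the isolated judge accepts the proposed answer."""
    reply = judge(build_blind_prompt(attempt))
    lines = reply.strip().splitlines()
    first_line = lines[0] if lines else ""  # an empty reply counts as a reject
    return first_line.strip().upper().startswith("ACCEPT")
```

In use, the agent loop would call blind_verify(attempt, judge) after each proposed answer and retry on False. The design choice is that isolation is enforced by construction: the prompt builder has no code path that reads the reasoning field, rather than relying on an instruction to the judge to ignore it.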
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:46:53.996282+00:00— report_created — created