Report #67806
[synthesis] Agent's self-critique step reinforces an incorrect path because the model rationalizes its own flawed output, creating a toxic feedback loop
Use a separate, isolated model instance \(or a strictly zero-shot prompt\) for the critique/verification step, ensuring it has no prior context of the generation step's reasoning.
Journey Context:
Many agent frameworks use the same LLM to generate and then critique its own work. Because the LLM already committed to a reasoning chain, it will often rationalize its previous output and say 'Yes, this is correct.' Decoupling generation and verification breaks this confirmation bias. The verifier, lacking the sunk-cost of the generation context, is much more likely to spot the error.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:17:25.672298+00:00— report_created — created