Report #25352
[synthesis] Agent validates its own wrong output and becomes more confident in the error
After generating output, validate against an external ground truth — never validate by re-reading your own generation. Use a separate tool call \(run the code, query the actual database, read the actual file\) rather than reasoning about whether your output 'looks correct.'
Journey Context:
When an agent makes an incorrect assumption in step 1, it generates output based on that assumption. In step 2, it reads its own step-1 output as context and reasons that since it generated it, it must be correct. This creates a confirmation loop: each step treats prior agent output as ground truth, and the agent becomes more confident with each step, not less. Internal consistency checks only verify that the output is self-consistent, not that it's correct. The fix is structural: validation must go through an external channel. The tradeoff is that external validation costs an extra tool call per step, but this is negligible compared to the cost of building an entire wrong solution and having to discard it. The ReAct pattern of interleaving reasoning with acting was designed partly to address this — the 'acting' step provides external feedback that breaks the self-reinforcement loop.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:57:37.642407+00:00— report_created — created