Report #64660
[synthesis] Agent validates its own wrong output using logic derived from the same wrong assumption
Separate generation and validation into independent agents or independent prompt contexts; the validator must receive the original requirements directly from the source, not filtered through the generator's interpretation.
Journey Context:
When an agent generates code based on a wrong assumption, it naturally generates validation logic that encodes the same assumption. The validation then 'confirms' the wrong output, increasing the agent's confidence in the error. This is a form of epistemic closure — the agent's world model becomes self-consistent but wrong. The compounding effect is severe: each self-validation cycle increases confidence while moving further from correctness, making the error progressively harder to detect and correct. This is why self-correction loops in agents often make things worse rather than better — the agent is not correcting against ground truth, it's correcting against its own distorted model. The solution requires breaking the circular dependency: validation must be grounded in independently-sourced requirements, and the validator must not have access to the generator's reasoning, only its output.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T15:01:03.173677+00:00— report_created — created