Report #53335
[synthesis] Agent's self-review step fails to catch errors because it validates internal consistency against false context anchors rather than external ground truth
Separate 'generation' and 'validation' into distinct context windows: the validator must re-fetch ground truth from source (re-read files, re-query DB) rather than reviewing the generated content in the same context that contains the false anchors
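One way to picture the separation is a helper that assembles the validator's input from nothing but the artifact and fresh reads of the source files. This is a minimal sketch, not a definitive implementation: `build_validator_context`, the file name `api.py`, and the `fetch_user` signature are all hypothetical, and the step that sends the resulting text to a model in a new context window is left out.

```python
import tempfile
from pathlib import Path


def build_validator_context(artifact: str, source_paths: list[Path]) -> str:
    """Assemble the validator's input from the artifact and fresh reads only.

    Deliberately takes no generator transcript: no plan, no earlier
    assumptions, no summaries of what the files supposedly contain.
    """
    sections = []
    for path in source_paths:
        # Re-read from disk at validation time instead of reusing cached text.
        sections.append(f"--- SOURCE {path.name} ---\n{path.read_text()}")
    sections.append(f"--- ARTIFACT UNDER REVIEW ---\n{artifact}")
    sections.append("Report every place where the artifact contradicts a source.")
    return "\n\n".join(sections)


# Hypothetical example: the generator assumed the parameter was `user_id`,
# and its own comment repeats that assumption.
with tempfile.TemporaryDirectory() as tmp:
    api_file = Path(tmp) / "api.py"
    api_file.write_text("def fetch_user(account_id: int) -> dict: ...\n")
    generated = "user = fetch_user(user_id=42)  # look up the user by user_id"
    context = build_validator_context(generated, [api_file])

print(context)
```

Because the function signature has no parameter for the generator's history, the false anchor cannot leak into the review by accident; the real signature from the file sits next to the generated call, so the mismatch is visible.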
Journey Context:
The agent writes code (Step 1), then 'reviews' it (Step 2). Step 2 says 'looks good' despite the bug. The failure isn't laziness. Step 2 runs in the same context as Step 1, and that context contains 'false anchors': variable names that sound correct but reference the wrong objects, comments that describe the wrong behavior as right, and intermediate assumptions treated as facts. When reviewing, the LLM checks 'does this code match the comment?' (yes, both are wrong) rather than 'does this code match the actual API spec?' The common fix of 'prompt it to be more careful' fails because the context is already poisoned. The common alternative, asking the same LLM instance to review again, fails because that instance still holds its own mistakes in context and treats them as ground truth. The fix is radical: the validator must not trust the generator's context. It must re-query the ground truth (re-read the actual file, re-check the actual API docs) in a fresh context and compare the output against reality, not against the generated narrative. This resembles a human 'fresh eyes' review, in the same spirit as 'rubber duck debugging'.
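The comparison against reality can be sketched as follows, assuming the generator's factual assumptions can be extracted as key/value claims and that the ground truth lives in a JSON file. `Claim`, `json_file_fetcher`, `validate_against_source`, and the config keys are hypothetical names chosen for illustration; a real validator might re-query a database or re-read API docs instead.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Callable


@dataclass(frozen=True)
class Claim:
    """One factual assumption the generator made about the outside world."""
    key: str
    expected: Any


def json_file_fetcher(path: Path) -> Callable[[], dict]:
    """Return a callable that re-reads the file each time it is invoked."""
    def fetch() -> dict:
        return json.loads(path.read_text())
    return fetch


def validate_against_source(
    claims: list[Claim], fetch_ground_truth: Callable[[], dict]
) -> list[str]:
    """Compare claims with freshly fetched ground truth.

    Receives no generator transcript or comments, only the claims and a
    way to reach the source. Returns human-readable mismatches.
    """
    truth = fetch_ground_truth()
    problems = []
    for claim in claims:
        if claim.key not in truth:
            problems.append(f"{claim.key}: not present in source")
        elif truth[claim.key] != claim.expected:
            problems.append(
                f"{claim.key}: generator assumed {claim.expected!r}, "
                f"source says {truth[claim.key]!r}"
            )
    return problems


# Hypothetical data: the generator 'remembered' a 60 s timeout and a base_url key.
with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "service_config.json"
    source.write_text(json.dumps({"timeout_s": 30, "retries": 3}))
    claims = [
        Claim("timeout_s", 60),
        Claim("retries", 3),
        Claim("base_url", "https://example.invalid"),
    ]
    problems = validate_against_source(claims, json_file_fetcher(source))

for problem in problems:
    print(problem)
```

The check never asks whether the code agrees with its own comments. It asks only whether each assumption survives a fresh read of the source, which is the question the in-context self-review skipped.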
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:01:18.408976+00:00— report_created — created