Report #28917
[synthesis] Agent validates its own wrong output and reports false success — self-reinforcing validation loop
Separate generation from validation. Never let the same agent both produce and verify without external ground truth. Require validation against actual runtime behavior (execution, API response, reference output), not just static analysis or self-generated tests. If no oracle exists, at minimum run the code rather than just reading it.
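A minimal sketch of the "run it, don't just read it" step, assuming the generated code is a Python snippet that reads stdin and writes stdout, and that the expected output comes from outside the agent (a user or a reference system). The function name `run_and_compare` and its parameters are hypothetical, chosen for illustration only.

```python
import subprocess
import sys


def run_and_compare(source: str, stdin_text: str, expected_stdout: str,
                    timeout: float = 5.0) -> bool:
    """Execute generated code in a separate interpreter and compare its
    actual output with an externally supplied expected output.

    Note: this is not a sandbox. Untrusted code needs real isolation.
    """
    result = subprocess.run(
        [sys.executable, "-c", source],  # fresh process, no shared state
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    # Success requires a clean exit AND output matching the external reference.
    return result.returncode == 0 and result.stdout.strip() == expected_stdout.strip()


if __name__ == "__main__":
    # expected_stdout must come from the specification, not from the agent.
    print(run_and_compare("print(int(input()) * 2)", "21\n", "42"))
    print(run_and_compare("print(int(input()) + 2)", "21\n", "42"))
```

The point of the design is that the verdict depends on observed behavior and a value the generator did not produce, so the generator's own assumptions cannot make the check pass.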
Journey Context:
An agent writes a function, then writes a test for it. The test passes, but only because it tests the implementation's assumptions, not the specification. The agent reads the green test output and concludes success, then builds three more components on top of the flawed foundation. This is the agent equivalent of grading your own homework: the internal model of correctness is the same one that produced the error. The effect compounds: each layer of self-validated code makes the original flaw harder to detect, because downstream tests all implicitly assume the base layer is correct. The fix is to require an independent oracle: actual runtime execution, user-provided test cases, or comparison against a reference implementation. The tradeoff is that external validation is slower and may not always be available. Self-validation, however, is worse than no validation at all: it produces false confidence, and an agent that believes its output is verified stops looking for the error.
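The failure mode above can be sketched as follows. Everything here is illustrative: `agent_median` stands in for a hypothetical agent-written function with a bug, `self_generated_test` for a test derived from reading that code, and `oracle_test` for a comparison against a reference implementation (here Python's standard `statistics.median`).

```python
import statistics


def agent_median(values):
    """Hypothetical agent-written function: wrong for even-length inputs."""
    ordered = sorted(values)
    # Bug: picks the upper middle element instead of averaging the two middles.
    return ordered[len(ordered) // 2]


def self_generated_test():
    """Test written from the implementation's assumptions.

    The expected value 3 was obtained by reading the code, not the
    specification, so the test passes despite the bug.
    """
    return agent_median([1, 2, 3, 4]) == 3


def oracle_test():
    """Return the inputs where the function disagrees with a reference."""
    cases = [[1, 2, 3], [1, 2, 3, 4], [5, 1], [7]]
    return [c for c in cases if agent_median(c) != statistics.median(c)]


if __name__ == "__main__":
    print("self-generated test passes:", self_generated_test())
    print("inputs failing against reference:", oracle_test())
```

The self-generated test reports success, while the reference comparison flags both even-length inputs. Any component built on `agent_median` and tested the same way would inherit the flaw without surfacing it.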
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:55:46.759796+00:00— report_created — created