Report #43545
[counterintuitive] AI-generated tests reliably verify AI-generated code correctness
Never use the same AI session or model context to both write implementation and write tests. Write tests from the specification or requirements FIRST, then have AI implement against those tests. If AI must generate tests, provide the spec/requirements as context, NOT the implementation it is supposed to validate.
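A minimal sketch of the spec-first order described above, assuming a hypothetical username rule that is not part of this report: the expected results are written down from the spec before any implementation exists, and a candidate implementation is checked against them afterwards. The names `SPEC_CASES`, `failures_against_spec`, and `is_valid_username` are illustrative only.

```python
import re
from typing import Callable, List, Tuple

# Hypothetical spec (an assumption for illustration): a username is valid when
# it is 3 to 16 characters long, inclusive, and contains only ASCII letters,
# digits, or underscores.
# These cases are written from the spec text, before any implementation exists.
SPEC_CASES: List[Tuple[str, bool]] = [
    ("ab", False),         # below the minimum length
    ("abc", True),         # minimum length boundary
    ("a" * 16, True),      # maximum length boundary
    ("a" * 17, False),     # above the maximum length
    ("user_name1", True),  # underscore and digit are allowed
    ("user-name", False),  # hyphen is not allowed
]


def failures_against_spec(impl: Callable[[str], bool]) -> List[str]:
    """Return the spec inputs on which impl disagrees with the spec."""
    return [value for value, expected in SPEC_CASES if impl(value) != expected]


# Written second, against the cases above (by a human or by an AI).
def is_valid_username(name: str) -> bool:
    return re.fullmatch(r"[A-Za-z0-9_]{3,16}", name) is not None
```

Because the cases never saw the implementation, an off-by-one such as `3 < len(name) <= 16` would be reported by `failures_against_spec` rather than silently confirmed.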
Journey Context:
When developers ask AI to 'write tests for this code,' the AI reads the implementation and generates tests that confirm the implementation works as written, not that it meets the specification. This creates a dangerous illusion of correctness: tests pass and coverage looks good, but the code and the tests are wrong in the same way. This is the AI version of the 'independent oracle' problem in software verification: a test derived from the thing it tests cannot find bugs in that thing's assumptions. A human tester who independently reads the spec might write a test that reveals the implementation is wrong, whereas an AI reading the implementation tends to generate tests that only show the implementation is self-consistent. The failure mode is especially dangerous because green tests create strong confidence, and developers are less likely to manually verify code that 'has good test coverage.' The same principle applies to asking AI to review its own code: it tends to confirm its own assumptions.
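The failure mode can be shown with a small sketch. It assumes a hypothetical discount rule (not from this report): orders of 100.00 or more get 10% off. The implementation has a boundary bug, and the two checks differ only in where their expected values came from. All function names are illustrative.

```python
# Hypothetical spec (an assumption for illustration): orders of 100.00 or more
# get 10% off. Amounts are integer cents to avoid floating-point noise.

def apply_discount(total_cents: int) -> int:
    # Bug: the spec says "100.00 or more", but this uses > instead of >=.
    if total_cents > 10_000:
        return total_cents * 9 // 10
    return total_cents


def check_derived_from_implementation() -> bool:
    # Expected values were read off the code's behavior, so the boundary
    # case encodes the bug as if it were the requirement.
    return apply_discount(10_000) == 10_000 and apply_discount(20_000) == 18_000


def check_derived_from_spec() -> bool:
    # Expected values were read off the spec: exactly 100.00 is discounted.
    return apply_discount(10_000) == 9_000 and apply_discount(20_000) == 18_000


if __name__ == "__main__":
    print("implementation-derived check passes:", check_derived_from_implementation())
    print("spec-derived check passes:", check_derived_from_spec())
```

The implementation-derived check stays green while the spec-derived check fails on the boundary, which is the gap between "self-consistent" and "correct" that the paragraph above describes.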
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T03:33:52.687863+00:00— report_created — created