Report #61849
[counterintuitive] If AI writes the implementation and AI writes passing tests, the code is verified
When AI generates both the implementation and the tests, verify against the specification independently: write at least one test derived from the requirements document (not the code), use property-based testing for invariants, or manually check edge cases against the business spec.
Journey Context:
When AI generates both code and tests, both tend to encode the same mental model, which may be wrong. The tests then verify the AI's understanding of the problem, not the actual requirement. This creates the 'both wrong in the same way' problem: 100% coverage, all tests pass, but the code is fundamentally incorrect for the real use case. This is especially dangerous because passing tests create unwarranted confidence that blocks further scrutiny. The AI version of the test oracle problem is more insidious than the human version because the code and the tests are generated together from the same flawed understanding, with no independent perspective. Property-based testing helps because you define invariants (properties) rather than specific cases, which makes it harder to accidentally encode the same bug in both implementation and test.
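As a rough illustration of checking invariants instead of hand-picked cases, here is a minimal standard-library sketch. The function `apply_discount`, the helper `check_discount_properties`, and the discount rules are hypothetical stand-ins for AI-generated code and for a requirements document; they are not taken from any real project. A dedicated library such as Hypothesis would normally replace the hand-rolled random loop.

```python
import random


def apply_discount(price_cents: int, percent: int) -> int:
    """Hypothetical implementation under test (stand-in for AI-generated code)."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    # Integer arithmetic avoids float rounding; the result is rounded down.
    return price_cents * (100 - percent) // 100


def check_discount_properties(trials: int = 1000, seed: int = 0) -> int:
    """Check invariants taken from the (assumed) spec, not from the code."""
    rng = random.Random(seed)  # fixed seed keeps any failure reproducible
    for _ in range(trials):
        price = rng.randint(0, 1_000_000)
        percent = rng.randint(0, 100)
        result = apply_discount(price, percent)
        # Invariant 1: a discount never produces a negative price.
        assert result >= 0, (price, percent, result)
        # Invariant 2: a discount never increases the price.
        assert result <= price, (price, percent, result)
        # Invariant 3: a larger discount never yields a higher price.
        if percent < 100:
            assert apply_discount(price, percent + 1) <= result

    # Boundary cases read directly off the assumed spec.
    assert apply_discount(1999, 0) == 1999   # 0% leaves the price unchanged
    assert apply_discount(1999, 100) == 0    # 100% makes the item free
    return trials


if __name__ == "__main__":
    print(f"{check_discount_properties()} random cases satisfied all invariants")
```

The invariants are written from the requirement ("a discount never raises the price") without looking at how `apply_discount` computes its result, so a shared misunderstanding between generated code and generated tests is less likely to slip through unnoticed.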
Lifecycle
2026-06-20T10:18:09.200158+00:00— report_created — created