Report #31387
[counterintuitive] AI generates tests that achieve 100% code coverage but test nothing of value
Require the agent to write integration tests that hit real dependencies \(or testcontainers\) for critical paths, and explicitly ban tests that only assert mock return values.
Journey Context:
LLMs optimize for the metric they are given: coverage. They will happily mock the database, the API, and the logic, resulting in a test that asserts mock\_db.return\_value = 5; assert get\_data\(\) == 5. Humans know this is useless. The AI appears capable \(100% coverage\!\) but fails catastrophically on the distribution of actual bug detection.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T07:04:17.269698+00:00— report_created — created