Report #20725
[gotcha] Improving AI accuracy from 90% to 95% can increase the rate of uncaught errors because users stop verifying mostly-correct output
Implement trust-calibration UI patterns: periodically prompt users to verify specific parts of the output, vary the presentation so users cannot predict which sections need scrutiny, and never display UI elements (checkmarks, 'verified' badges) that imply AI output has been validated. Track the user's verification rate and alert if it drops below a threshold.
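A minimal sketch of the tracking and spot-check parts of this recommendation, in Python. The names `VerificationTracker` and `pick_spot_checks`, the window size, and the 30% threshold are all illustrative assumptions, not values from this report. What counts as a "verification" (opening a diff, running tests, answering a prompt) is left to the product.

```python
import random
from collections import deque


class VerificationTracker:
    """Rolling record of whether the user verified each AI output (hypothetical helper)."""

    def __init__(self, window=50, min_rate=0.3, min_samples=10):
        # Assumed defaults: last 50 outputs, alert below 30% verified.
        self.events = deque(maxlen=window)  # True = user verified before accepting
        self.min_rate = min_rate
        self.min_samples = min_samples

    def record(self, verified):
        """Log one accepted output and whether the user verified it first."""
        self.events.append(bool(verified))

    def rate(self):
        """Fraction of recent outputs that were verified, or None with no data."""
        if not self.events:
            return None
        return sum(self.events) / len(self.events)

    def should_alert(self):
        """True once enough samples exist and the rate is below the threshold."""
        if len(self.events) < self.min_samples:
            return False
        return self.rate() < self.min_rate


def pick_spot_checks(section_ids, k, rng=None):
    """Pick k sections uniformly at random so users cannot predict the prompts."""
    if rng is None:
        rng = random.Random()
    sections = list(section_ids)
    return rng.sample(sections, min(k, len(sections)))
```

Typical use would be to call `record(...)` each time the user accepts an output, check `should_alert()` afterwards, and call `pick_spot_checks(...)` when rendering a response to decide which sections get a verification prompt. Uniform random selection is the simplest choice here; weighting by section risk is a possible refinement.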
Journey Context:
Automation bias is the well-documented tendency for humans to accept automated system output without sufficient scrutiny. The counter-intuitive finding: as AI accuracy improves, the uncaught error rate can increase, because users calibrate their vigilance to the system's overall reliability. When the AI is wrong 10% of the time, users tend to check most output and catch most errors. When it is wrong 5% of the time, many users stop checking and miss most of the errors that remain. The product feels better (fewer obvious errors), but the failure mode shifts from caught errors to uncaught errors, which are far more dangerous because they propagate downstream. This is especially acute in AI code generation: a developer who accepts a 95%-correct suggestion without review can ship a subtle bug that the same developer would have caught while reviewing a 90%-correct suggestion. The fix is not to make the AI worse, but to design the UI so that users stay vigilant regardless of AI accuracy. This means deliberate friction: verification prompts, random spot-check requests, and visual design that signals 'this is a suggestion, not a verified answer.'
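The arithmetic behind this can be sketched in Python, under the simplifying assumption that the uncaught error rate is the AI's error rate multiplied by the fraction of errors users fail to catch. The catch rates used below (80% and 40%) are made-up numbers chosen to illustrate the effect, not measurements from this report.

```python
def uncaught_error_rate(error_rate, catch_rate):
    """Share of all outputs that are wrong AND slip past the user.

    Assumes errors are caught independently with probability catch_rate.
    """
    return error_rate * (1.0 - catch_rate)


def break_even_catch_rate(old_error, old_catch, new_error):
    """Catch rate the more accurate system needs just to match the old uncaught rate."""
    return 1.0 - uncaught_error_rate(old_error, old_catch) / new_error


# Hypothetical scenario: accuracy improves from 90% to 95%,
# but the share of errors users catch falls from 80% to 40%.
before = uncaught_error_rate(0.10, 0.80)
after = uncaught_error_rate(0.05, 0.40)
needed = break_even_catch_rate(0.10, 0.80, 0.05)
```

With these assumed numbers, the uncaught rate rises from about 2% of outputs to about 3% even though the AI makes half as many errors, and users would need to keep catching about 60% of errors for the accuracy gain to break even.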
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:11:33.669334+00:00— report_created — created