Report #56545
[gotcha] Why does displaying chain-of-thought reasoning make users more likely to accept incorrect AI outputs?
If you show reasoning, pair it with specific verification affordances—not a generic 'is this correct?' but targeted checks like 'Verify the cited statistic' or 'Confirm this file path exists.' If you hide reasoning, don't replace it with performative confidence language like 'I analyzed this thoroughly.' The most dangerous pattern is showing plausible-sounding reasoning for low-confidence outputs. Either show real reasoning with verification hooks, or show only the output with appropriate confidence signals.
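One way to make the rule above concrete is a small display policy that shows reasoning only when targeted checks accompany it and the output is not low-confidence. This is a minimal sketch, not a reference implementation: `ModelResult`, `build_display`, the `LOW_CONFIDENCE` threshold, and the assumption that a calibrated confidence score is available are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical cutoff; in practice it would come from your own calibration data.
LOW_CONFIDENCE = 0.6


@dataclass
class ModelResult:
    output: str
    reasoning: str
    confidence: float  # assumed to be a calibrated score in [0, 1]
    # Targeted checks, e.g. "Confirm this file path exists".
    checks: list[str] = field(default_factory=list)


def build_display(result: ModelResult) -> dict:
    """Decide what the UI shows for one model result."""
    low = result.confidence < LOW_CONFIDENCE
    # Reasoning is shown only with verification hooks, and never for
    # low-confidence outputs, where fluent reasoning is most misleading.
    show_reasoning = bool(result.checks) and not low
    return {
        "output": result.output,
        "reasoning": result.reasoning if show_reasoning else None,
        "checks": list(result.checks),
        "confidence_label": (
            "low confidence, verify before use" if low else "higher confidence"
        ),
    }
```

With this policy, a result that has no checks attached falls back to output plus a confidence label, rather than showing unverifiable reasoning.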
Journey Context:
The intuition: showing AI reasoning equals transparency, which lets users catch errors, producing better outcomes. The reality: showing reasoning triggers anchoring bias. Users read the reasoning, find it plausible because LLM reasoning is fluent even when wrong, and anchor on it. This INCREASES confidence in the output, including wrong outputs. The effect is strongest for complex tasks where users feel unqualified to evaluate the reasoning, which are exactly the tasks where AI is most likely to be wrong in non-obvious ways. The tradeoff: hiding all reasoning removes user agency and makes errors much harder to debug. The fix is selective, actionable verification rather than blanket transparency. Cognitive forcing functions, where users must form their own answer before seeing the AI's, have been reported to reduce overreliance more effectively than explanations alone.
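A cognitive forcing function of the kind described above can be sketched as a gate that withholds the AI answer until the user has committed to their own. The class name `ForcedAnswerGate` and its methods are hypothetical, and the string comparison used for `agree` is a deliberately naive placeholder.

```python
class ForcedAnswerGate:
    """Withholds the AI answer until the user has committed to their own."""

    def __init__(self, ai_answer: str):
        self._ai_answer = ai_answer
        self.user_answer = None

    def commit(self, user_answer: str) -> None:
        """Record the user's own answer; it cannot be changed afterwards."""
        answer = user_answer.strip()
        if not answer:
            raise ValueError("an empty answer does not count as a commitment")
        if self.user_answer is not None:
            raise RuntimeError("an answer has already been committed")
        self.user_answer = answer

    def reveal(self) -> dict:
        """Return both answers, but only after the user has committed."""
        if self.user_answer is None:
            raise PermissionError("commit your own answer first")
        return {
            "user_answer": self.user_answer,
            "ai_answer": self._ai_answer,
            # Naive case-insensitive comparison; real tasks need a better match.
            "agree": self.user_answer.casefold() == self._ai_answer.strip().casefold(),
        }
```

Locking the committed answer keeps the user from revising it after seeing the AI output, so any disagreement stays visible.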
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:24:14.027638+00:00 — report_created — created