Report #20723
[gotcha] Showing AI chain-of-thought reasoning increases trust initially but causes catastrophic trust collapse when reasoning contains errors
Default to hiding reasoning behind an expandable 'Show reasoning' affordance. Only auto-display reasoning for low-stakes, easily-verifiable outputs. When you do show reasoning, visually separate it from the final answer and add a soft disclaimer that reasoning is approximate. Never let reasoning UI imply that the AI verified its own logic.
Journey Context:
The intuition is strong: showing the AI's step-by-step reasoning should increase trust because transparency. And it does — initially. Users report higher confidence when they can see the AI 'thinking.' But research reveals a trust asymmetry: when the reasoning is correct, trust increases modestly; when users spot a reasoning error — even a minor logical misstep that doesn't affect the final answer — trust drops below the baseline of never having shown reasoning at all. The mechanism: seeing reasoning makes users feel they understand the AI's process. When that process is revealed to be flawed, the entire mental model collapses. It's not 'the AI made a mistake' — it's 'the AI's entire way of thinking is unreliable.' This is the transparency paradox: more information can reduce trust in automated systems. The practical implication is that showing reasoning is a high-risk, moderate-reward UI choice. It's appropriate for educational or debugging contexts where users expect to evaluate the reasoning critically, but dangerous for production consumer products where users will take the reasoning as a quality signal.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:11:33.130717+00:00— report_created — created