Report #63016
[gotcha] Showing AI chain-of-thought reasoning reduces user trust when the reasoning contains visible flaws, even if the final answer is correct
Default to hiding chain-of-thought reasoning in production UIs. If you must show reasoning (for verification-critical domains), put it in a collapsible section below the answer, not inline. Never present reasoning as a justification for the answer; label it as 'additional context' that may be imperfect. For high-stakes domains (medical, legal, financial), show reasoning only when it can be validated against known-correct steps, not as raw model output.
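One way the display rules above could be wired up is sketched below. It is a minimal illustration, not a reference implementation: `ModelOutput`, `should_show_reasoning`, `render_response`, and the `reasoning_validated` flag are hypothetical names, and the set of high-stakes domains is simply copied from the examples in this report. It relies on the standard HTML `<details>` element, which is collapsed unless it carries an `open` attribute.

```python
from dataclasses import dataclass
from html import escape
from typing import Optional

# Assumption: the high-stakes list mirrors the examples named in this report.
HIGH_STAKES_DOMAINS = {"medical", "legal", "financial"}


@dataclass
class ModelOutput:
    answer: str
    reasoning: Optional[str] = None
    domain: str = "general"
    # Hypothetical flag, set by whatever upstream step checks the
    # reasoning against known-correct steps.
    reasoning_validated: bool = False


def should_show_reasoning(out: ModelOutput, show_reasoning: bool) -> bool:
    """Hidden by default; high-stakes domains also require validation."""
    if not show_reasoning or not out.reasoning:
        return False
    if out.domain in HIGH_STAKES_DOMAINS:
        return out.reasoning_validated
    return True


def render_response(out: ModelOutput, show_reasoning: bool = False) -> str:
    """Render the answer first, with reasoning in a collapsed section below it."""
    parts = [f'<div class="answer">{escape(out.answer)}</div>']
    if should_show_reasoning(out, show_reasoning):
        parts.append(
            "<details>"
            "<summary>Additional context (may be imperfect)</summary>"
            f"<pre>{escape(out.reasoning)}</pre>"
            "</details>"
        )
    return "\n".join(parts)
```

Calling `render_response(out)` with the default `show_reasoning=False` returns only the answer; the reasoning section appears only when the caller opts in and, for a high-stakes domain, only when `reasoning_validated` is true.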
Journey Context:
The intuition is appealing: show the AI's reasoning so users can verify that it is thinking correctly. Some reasoning models expose their chain-of-thought (CoT), at least in summarized form. In practice, though, CoT reasoning often contains logical leaps, incorrect intermediate steps, or fabricated evidence, yet still arrives at a correct conclusion. When users see flawed reasoning, they lose trust in the entire output, even when the final answer is independently correct. Research on CoT faithfulness from Anthropic shows that model explanations don't always reflect the actual computation path: the model may have arrived at the answer through pattern matching and then generated plausible-sounding reasoning post hoc. This creates a double bind: showing reasoning exposes flaws that undermine trust, but hiding it removes the ability to verify. The pragmatic solution is to decouple reasoning from the answer in the UI, making the answer prominent and the reasoning optional and secondary. OpenAI's o1 design implicitly acknowledges this by hiding raw reasoning tokens by default and showing only a summary. In verification-critical domains, structured verification (checking individual claims against sources) is a better approach than showing raw CoT.
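The structured-verification idea can be sketched as a per-claim check against a set of sources. Everything here is illustrative: `Claim` and `verify_claims` are hypothetical names, and the case-insensitive substring match is only a placeholder for whatever real check (retrieval, entailment, human review) a production system would use.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Claim:
    text: str       # one atomic statement extracted from the model output
    source_id: str  # the source the claim is attributed to


def verify_claims(
    claims: List[Claim], sources: Dict[str, str]
) -> List[Tuple[str, str]]:
    """Label each claim 'supported', 'unsupported', or 'missing_source'.

    Assumption: substring matching stands in for a real verification step.
    """
    results = []
    for claim in claims:
        source_text = sources.get(claim.source_id)
        if source_text is None:
            status = "missing_source"
        elif claim.text.lower() in source_text.lower():
            status = "supported"
        else:
            status = "unsupported"
        results.append((claim.text, status))
    return results
```

A UI built on this would surface the per-claim statuses next to the answer instead of the raw reasoning text, so users verify claims rather than judge the model's narrative.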
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:15:15.902983+00:00— report_created — created