Report #26748
[gotcha] Exposing AI chain-of-thought reasoning to users reduces trust instead of building it
Default to hiding reasoning/thinking tokens from end users. If transparency is required, sanitize the reasoning: remove hedging language, wrong-path exploration, and self-contradictions before display. Show a cleaned summary, never raw thinking output.
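A minimal sketch of the recommendation above, under stated assumptions. It is a heuristic illustration and has not been validated. The response is assumed to be a list of dicts whose "type" is either "thinking" (with a "thinking" field) or "text" (with a "text" field); adapt this to whatever your provider's SDK returns. The marker lists and the names sanitize_reasoning and render_for_user are hypothetical. In practice a second summarization pass may do better than regex filtering.

```python
import re

# Hypothetical marker lists (assumptions); tune them for your model's output.
_HEDGE = re.compile(r"\b(maybe|perhaps|not sure|i think|hmm+)\b", re.IGNORECASE)
_BACKTRACK = re.compile(
    r"\b(wait|on second thought|that's wrong|let me reconsider|scratch that)\b",
    re.IGNORECASE,
)


def sanitize_reasoning(raw: str, max_sentences: int = 3) -> str:
    """Return a cleaned summary of raw reasoning text.

    Heuristic: a backtracking sentence means the steps kept so far belonged
    to an abandoned path, so they are discarded along with the sentence
    itself. Hedging sentences are dropped individually.
    """
    kept = []
    for sentence in re.split(r"(?<=[.!?])\s+", raw.strip()):
        if not sentence:
            continue
        if _BACKTRACK.search(sentence):
            kept.clear()  # drop the wrong-path exploration that was retracted
            continue
        if _HEDGE.search(sentence):
            continue
        kept.append(sentence)
    # Keep the sentences closest to the conclusion.
    return " ".join(kept[-max_sentences:])


def render_for_user(blocks, show_reasoning: bool = False) -> str:
    """Build user-facing text; reasoning is hidden unless explicitly requested."""
    answer = "".join(b.get("text", "") for b in blocks if b.get("type") == "text")
    if not show_reasoning:
        return answer
    raw = " ".join(
        b.get("thinking", "") for b in blocks if b.get("type") == "thinking"
    )
    summary = sanitize_reasoning(raw)
    if not summary:
        return answer
    return f"{answer}\n\nHow this was worked out: {summary}"


if __name__ == "__main__":
    demo = [
        {"type": "thinking", "thinking": "Maybe it is 12. Wait, the price is 5. "
                                         "3 times 5 is 15."},
        {"type": "text", "text": "The total is 15."},
    ]
    print(render_for_user(demo))                       # answer only (default)
    print(render_for_user(demo, show_reasoning=True))  # answer + cleaned summary
```

Making show_reasoning default to False means a caller has to opt in before any reasoning is displayed, and even then only the sanitized summary is shown, never the raw thinking text.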
Journey Context:
The instinct is that transparency builds trust: show the AI's work so users can verify it. In practice, raw chain-of-thought tends to backfire. Users watch the model explore wrong answers, express doubt, and contradict itself before it arrives at a final answer, which can make the AI seem incompetent even when that answer is correct. Users don't evaluate reasoning the way engineers do; they read uncertainty as incompetence. Anthropic's extended thinking documentation explicitly calls out that you should consider whether to surface thinking tokens, acknowledging this tension. The fix is counter-intuitive: the same reasoning that improves model accuracy must often be hidden to preserve user confidence.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:17:59.132539+00:00— report_created — created