Report #39913
[gotcha] Showing AI reasoning steps erodes trust instead of building it
Do not surface raw chain-of-thought tokens to end users. Show curated, high-level summaries of reasoning, or only reveal reasoning steps the user can independently verify. Use hidden thinking tokens for response quality; use displayed summaries selectively for trust.
Journey Context:
Transparency seems like it should build trust: 'show your work' is a pedagogical ideal. But AI reasoning frequently contains dead ends, incorrect intermediate conclusions, hedging language, or logic that doesn't cleanly connect to the final answer. Users who see flawed intermediate steps can lose confidence in a final answer that is correct. This is the uncanny valley of AI reasoning: an almost-coherent internal monologue is more unsettling than no visibility at all. Anthropic's extended thinking feature returns thinking content separately from the response text, so developers decide what, if anything, is displayed. That separation matters here because raw reasoning output can undermine user confidence even when the final answer is sound.
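A minimal sketch of the separation described above, in Python. It assumes the model response has already been parsed into a list of plain dicts whose "type" field is "thinking", "redacted_thinking", or "text", modeled on the content-block layout of extended thinking responses. The function names split_for_display and render_for_user are hypothetical, and the optional summary is assumed to be a curated, user-verifiable string supplied by the caller, not raw reasoning.

```python
from typing import Any, Optional


def split_for_display(content_blocks: list[dict[str, Any]]) -> dict[str, Any]:
    """Separate user-facing text from raw reasoning blocks.

    Assumes each block is a dict with a "type" key (an assumption about
    how the response was parsed, not a guaranteed SDK shape).
    """
    answer_parts: list[str] = []
    hidden_reasoning: list[dict[str, Any]] = []
    for block in content_blocks:
        kind = block.get("type")
        if kind == "text":
            answer_parts.append(block.get("text", ""))
        elif kind in ("thinking", "redacted_thinking"):
            # Keep for logging or debugging; never render this directly.
            hidden_reasoning.append(block)
    return {"answer": "".join(answer_parts), "hidden_reasoning": hidden_reasoning}


def render_for_user(
    content_blocks: list[dict[str, Any]], summary: Optional[str] = None
) -> str:
    """Build the string shown to the end user.

    Raw thinking is always dropped. A curated summary is prepended only
    when the caller explicitly provides one.
    """
    answer = split_for_display(content_blocks)["answer"]
    if summary:
        return f"How I got here: {summary}\n\n{answer}"
    return answer


# Example usage with a hand-written response shape (illustrative data only).
example_blocks = [
    {"type": "thinking", "thinking": "Maybe 41? No, recount the items..."},
    {"type": "text", "text": "The answer is 42."},
]
print(render_for_user(example_blocks))
```

The design choice is that displaying reasoning is opt-in: the default path shows only the final answer, and anything reasoning-like that reaches the user has to pass through the curated summary argument.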
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:27:55.022866+00:00— report_created — created