Report #92191
[gotcha] Displaying AI extended thinking or chain-of-thought verbatim to end users destroys trust and leaks internals
Never surface raw thinking or reasoning tokens to end users. Either hide thinking blocks entirely or generate a separate, sanitized user-facing summary. If you must show reasoning, strip system prompt references, remove self-correction loops, and rephrase in user-facing language.
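A minimal sketch of the "hide thinking blocks entirely" option, assuming the response content arrives as a list of dicts shaped like Anthropic Messages API content blocks (`type` of `thinking`, `redacted_thinking`, or `text`). The function name `user_visible_text` is hypothetical, and other providers may use different block names.

```python
from typing import Any

# Block types treated as internal reasoning. The names follow the Anthropic
# Messages API content-block shape; they are an assumption for other providers.
HIDDEN_BLOCK_TYPES = {"thinking", "redacted_thinking"}


def user_visible_text(content: list[dict[str, Any]]) -> str:
    """Join only the text blocks, dropping any reasoning blocks."""
    parts = []
    for block in content:
        if block.get("type") in HIDDEN_BLOCK_TYPES:
            continue  # never forward scratchpad reasoning to the UI
        if block.get("type") == "text":
            parts.append(block.get("text", ""))
    return "\n".join(parts).strip()


# Example response content with one reasoning block and one answer block.
response_content = [
    {"type": "thinking", "thinking": "Wait, that is not right, let me reconsider..."},
    {"type": "text", "text": "Your order ships on Friday."},
]
print(user_visible_text(response_content))
```

Filtering by an explicit allow rule (only `text` blocks pass) means an unknown future block type is dropped rather than shown by accident.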
Journey Context:
Extended thinking and chain-of-thought exist to improve model accuracy by giving the model scratchpad space to reason. They are NOT designed as user-facing explanations. Raw thinking tokens contain hedging language, self-correction loops ('wait, that is not right, let me reconsider'), references to system instructions, and reasoning paths the model considered but rejected. Showing these verbatim creates several failure modes: (1) the uncanny-valley effect of watching a machine 'think' in alien, circular ways, (2) users anchor on discarded reasoning paths and get confused, (3) system prompt leakage that exposes safety guardrails, and (4) raw thinking that contradicts the final output because the model course-corrected mid-thought. The counter-intuitive trap: transparency feels like it should build trust, but raw CoT transparency destroys it. Anthropic's API returns extended thinking in content blocks separate from the final answer text, and OpenAI reasoning tokens are hidden by default. Some transparency is good, but it must be curated, not raw.
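If some reasoning must be shown, one rough way to curate it is to drop sentences that carry self-correction markers or references to internal instructions. The sketch below is a heuristic illustration only: the marker lists and the name `sanitize_reasoning` are hypothetical, broad markers such as "actually" will also drop some legitimate sentences, and a separate summarization pass is likely more robust in practice.

```python
import re

# Hypothetical marker lists; tune them for your own prompts and model.
SELF_CORRECTION = re.compile(
    r"\b(wait|actually|let me reconsider|on second thought|hmm)\b",
    re.IGNORECASE,
)
INTERNAL_REFERENCE = re.compile(
    r"\b(system prompt|system instructions?|my instructions|guardrails?)\b",
    re.IGNORECASE,
)


def sanitize_reasoning(raw: str) -> str:
    """Keep only sentences with no self-correction or internal-reference markers."""
    # Naive sentence split: break after ., !, or ? followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", raw.strip())
    kept = [
        s for s in sentences
        if s and not SELF_CORRECTION.search(s) and not INTERNAL_REFERENCE.search(s)
    ]
    return " ".join(kept)


raw = (
    "The user wants a refund. Wait, that is not right, let me reconsider. "
    "The system prompt says to avoid legal advice. "
    "The order is within the 30-day window, so a refund applies."
)
print(sanitize_reasoning(raw))
# → The user wants a refund. The order is within the 30-day window, so a refund applies.
```

The output still needs rephrasing into user-facing language; this step only removes the most obvious leaks and loops.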
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:20:05.348343+00:00— report_created — created