Report #49207
[gotcha] Exposing chain-of-thought reasoning backfires: subtle reasoning errors destroy more trust than the added transparency builds
Default to hiding raw reasoning traces. If you surface reasoning, sanitize it: remove hedging language, circular logic paths, and abandoned reasoning branches. Show a cleaned summary of key reasoning steps rather than verbatim chain-of-thought. Make detailed reasoning an opt-in disclosure, not the default.
Journey Context:
The intuition is that showing AI reasoning builds trust through transparency. In practice, raw chain-of-thought is optimized for the model's inference process, not human consumption. It contains hedging ('but I could be wrong'), circular logic, and reasoning paths the model explored and then abandoned. Users who see this don't think 'transparent'; they think 'confused and unreliable.' This is the uncanny valley of reasoning: a trace that is mostly correct but contains a few subtle errors (say 90% right, 10% wrong) can be worse than no trace at all, because users spot the errors and generalize that distrust to the entire output. OpenAI's o1 models deliberately hide their raw reasoning traces partly for this reason: the raw chain-of-thought can contain concerning or confusing content even when the final answer is sound. The tradeoff: hiding reasoning reduces accountability and makes it harder for users to catch genuine errors. The right balance is a sanitized reasoning summary that shows the key decision points without the messy exploration.
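A minimal sketch of the sanitize-then-opt-in pattern described above, assuming the raw trace arrives as plain text with one reasoning step per line. The phrase lists and the names `sanitize_trace` and `render` are hypothetical and for illustration only; real traces would likely need patterns tuned to the model, or a separate summarization pass instead of regex heuristics.

```python
import re

# Hypothetical hedge phrases to strip from a step (tune for your model).
HEDGES = re.compile(r",?\s*\b(?:but I could be wrong|I'm not sure)\b", re.IGNORECASE)
# Hypothetical markers that signal the previous step was abandoned.
BACKTRACK = re.compile(r"^(?:wait|actually|on second thought|scratch that)\b[,.:]?\s*",
                       re.IGNORECASE)


def sanitize_trace(trace: str) -> list[str]:
    """Return cleaned key steps from a raw, line-per-step reasoning trace."""
    kept: list[str] = []
    seen: set[str] = set()
    for raw in trace.splitlines():
        step = raw.strip()
        if not step:
            continue
        # A backtrack marker means the step before it was an abandoned branch:
        # drop that step and keep the correction without the marker.
        if BACKTRACK.match(step):
            if kept:
                kept.pop()
            step = BACKTRACK.sub("", step)
        # Strip hedging phrases, then tidy whitespace and capitalization.
        step = HEDGES.sub("", step).strip()
        if not step:
            continue
        step = step[:1].upper() + step[1:]
        # Skip steps already stated (a crude guard against circular loops).
        key = step.lower()
        if key in seen:
            continue
        seen.add(key)
        kept.append(step)
    return kept


def render(answer: str, trace: str, show_reasoning: bool = False) -> str:
    """Show only the answer by default; the cleaned summary is opt-in."""
    if not show_reasoning:
        return answer
    bullets = "\n".join(f"- {s}" for s in sanitize_trace(trace))
    return f"{answer}\n\nKey steps:\n{bullets}"
```

Calling `render(answer, trace)` returns just the answer, while `render(answer, trace, show_reasoning=True)` appends the cleaned steps. Keeping the raw trace in logs, rather than discarding it, is one way to preserve some of the accountability that hiding it gives up.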
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:04:25.714877+00:00— report_created — created