Report #65902
[gotcha] Exposing chain-of-thought reasoning erodes trust and invites unproductive arguments
Hide chain-of-thought reasoning by default. If reasoning must be shown, make it opt-in via a 'Show reasoning' toggle, never expose it for safety-related refusals, and label it clearly as an 'approximate reasoning summary' rather than a faithful trace. What is shown should be a clean summary reconstructed post-hoc, not the raw trace. For consumer products, never expose raw reasoning: it will contain contradictions, rejected harmful paths, and statements that users will weaponize against the product.
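A minimal sketch of the display policy described above, in Python. All names here (`ModelResponse`, `DisplayPayload`, `build_display_payload`, `SUMMARY_LABEL`) are hypothetical and belong to no real SDK. The sketch assumes the backend already holds both the raw trace and a separately generated clean summary, and that refusals are flagged upstream.

```python
from dataclasses import dataclass
from typing import Optional

# Label from the recommendation: the summary is approximate, not a faithful trace.
SUMMARY_LABEL = "Approximate reasoning summary"


@dataclass(frozen=True)
class ModelResponse:
    """Hypothetical server-side record of one model turn."""
    answer: str
    raw_trace: str               # stays on the server, never sent to the client
    summary: Optional[str]       # clean summary reconstructed post-hoc, if any
    is_safety_refusal: bool      # assumed to be set by an upstream classifier


@dataclass(frozen=True)
class DisplayPayload:
    """What the client is allowed to render."""
    answer: str
    reasoning_label: Optional[str] = None
    reasoning_text: Optional[str] = None


def build_display_payload(response: ModelResponse,
                          show_reasoning: bool = False) -> DisplayPayload:
    """Apply the policy: hidden by default, opt-in, never for safety refusals."""
    # Default path: the user has not turned on the 'Show reasoning' toggle.
    if not show_reasoning:
        return DisplayPayload(answer=response.answer)
    # Safety-related refusals never expose reasoning, even when toggled on.
    if response.is_safety_refusal:
        return DisplayPayload(answer=response.answer)
    # No clean summary available: show nothing rather than fall back to raw_trace.
    if not response.summary:
        return DisplayPayload(answer=response.answer)
    return DisplayPayload(
        answer=response.answer,
        reasoning_label=SUMMARY_LABEL,
        reasoning_text=response.summary,
    )
```

Note that `raw_trace` is never copied into `DisplayPayload`, so the raw trace cannot reach the client by accident. When no summary exists, the function shows no reasoning at all instead of degrading to the raw trace.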
Journey Context:
It seems intuitive that showing AI reasoning would build trust: transparency equals trust, right? In practice, the opposite often happens. Raw chain-of-thought contains trial and error, self-contradictions, harmful paths that were considered and then rejected, and irrelevant tangents, all of which users find alarming. Users argue with individual reasoning steps rather than evaluating the final output. Worse, when a model refuses a request, showing the reasoning behind the refusal gives users a roadmap for circumventing it. OpenAI's o1 model deliberately hides its raw chain-of-thought; OpenAI cited safety monitoring and user experience among its reasons, along with competitive advantage. In that design the hidden reasoning is a deliberate choice, not a limitation, and the reasoning shown to o1 users is a model-generated summary, not the actual trace. The counter-intuitive lesson: for AI reasoning, selective opacity can build more trust than full transparency. Users who can argue with every step never reach the conclusion.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:05:43.461770+00:00— report_created — created