Report #77509
[gotcha] Extended thinking/reasoning creates long blank pauses during streaming; users think the app is frozen or broken
When using models with extended thinking (e.g., Claude extended thinking, OpenAI o1/o3), show a distinct 'Reasoning...' UI state with animated indicators during the thinking phase. Do NOT show a generic spinner or blank screen; use language that sets the expectation that the model is working through the problem. Stream thinking tokens if the API supports it, even if you render them in a collapsed/hidden section.
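One way to keep the UI in step with the stream is a small state reducer, sketched below in TypeScript. The event shape (`StreamEvent`), the phase names, and the "Writing the answer..." label are hypothetical and not taken from any SDK; map your provider's actual stream events onto them. The sketch assumes the UI enters the reasoning state as soon as the request starts, so it also covers APIs that do not expose thinking tokens at all.

```typescript
// Hypothetical, provider-neutral event shape (an assumption for illustration).
// Translate your SDK's real stream events into these before reducing.
type StreamEvent =
  | { kind: "start" }
  | { kind: "thinking_delta"; text: string }
  | { kind: "text_delta"; text: string }
  | { kind: "done" };

type Phase = "idle" | "reasoning" | "answering" | "done";

interface ViewState {
  phase: Phase;
  thinking: string; // shown in a collapsed/hidden section
  answer: string; // shown as the visible reply
}

const initialState: ViewState = { phase: "idle", thinking: "", answer: "" };

// Pure reducer: given the current view state and one stream event,
// return the next view state.
function reduce(state: ViewState, event: StreamEvent): ViewState {
  switch (event.kind) {
    case "start":
      // Enter "reasoning" immediately so the screen is never blank,
      // even if no thinking deltas ever arrive.
      return { phase: "reasoning", thinking: "", answer: "" };
    case "thinking_delta":
      return { ...state, phase: "reasoning", thinking: state.thinking + event.text };
    case "text_delta":
      // The first visible token ends the reasoning phase.
      return { ...state, phase: "answering", answer: state.answer + event.text };
    case "done":
      return { ...state, phase: "done" };
  }
}

// Status text shown next to the animated indicator for each phase.
function statusLabel(phase: Phase): string {
  switch (phase) {
    case "reasoning":
      return "Reasoning about this problem...";
    case "answering":
      return "Writing the answer...";
    default:
      return "";
  }
}
```

Feed each incoming event through `reduce` and render `statusLabel(state.phase)` alongside the indicator. Keeping the reducer pure makes the phase transitions easy to unit test without a live stream.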
Journey Context:
Models with extended reasoning capabilities can spend 10-60+ seconds in a 'thinking' phase before emitting any visible output tokens. During this time, the streaming connection is open but no text content deltas arrive. Users see a blank, apparently frozen interface and either abandon the session, refresh, or conclude the app is broken. The counter-intuitive part: the model is doing its best work during this silence, reasoning through the problem, but the UX communicates the opposite. A generic spinner is a trap too: users associate spinners with loading or latency, not with productive work, so they still perceive the app as broken. The fix requires matching the UI language to the actual state: 'Reasoning about this problem...' sets a completely different expectation than a spinner.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:41:40.216581+00:00— report_created — created