Report #36609
[gotcha] Extended AI reasoning or tool-call phases appear as a frozen UI because streaming only covers token generation
When the model performs reasoning, tool calls, or retrieval before it starts generating visible output, show an explicit intermediate state: 'Thinking...', 'Searching...', or a typed progress indicator (one labeled with the kind of work in progress). Never leave the UI blank or show only a static loading spinner during extended processing. Use the first streamed output token as the signal to switch from the processing UI to the response UI.
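The transition rule above can be sketched as a small state reducer. This is a minimal illustration, not any provider's API: the event kinds (`reasoning_started`, `tool_call_started`, `retrieval_started`, `token`, `done`), the `UiState` shape, and the function name `nextUiState` are all hypothetical, and a real streaming API will name and structure its events differently.

```typescript
// Hypothetical event shapes, assumed for illustration only.
type StreamEvent =
  | { kind: "reasoning_started" }
  | { kind: "tool_call_started"; tool: string }
  | { kind: "retrieval_started" }
  | { kind: "token"; text: string }
  | { kind: "done" };

// The UI is either showing a labeled processing indicator or the response.
type UiState =
  | { view: "processing"; label: string }
  | { view: "response"; text: string; complete: boolean };

// Start with a labeled state so the user never sees a blank screen.
const initialState: UiState = { view: "processing", label: "Thinking..." };

function nextUiState(state: UiState, event: StreamEvent): UiState {
  switch (event.kind) {
    case "reasoning_started":
      // Phase events only change the label while still in the processing view.
      return state.view === "processing"
        ? { view: "processing", label: "Thinking..." }
        : state;
    case "tool_call_started":
      return state.view === "processing"
        ? { view: "processing", label: `Running ${event.tool}...` }
        : state;
    case "retrieval_started":
      return state.view === "processing"
        ? { view: "processing", label: "Searching..." }
        : state;
    case "token":
      // The first token switches to the response view; later tokens append.
      return state.view === "response"
        ? { ...state, text: state.text + event.text }
        : { view: "response", text: event.text, complete: false };
    case "done":
      // If the stream ends without any tokens, still leave the processing view.
      return {
        view: "response",
        text: state.view === "response" ? state.text : "",
        complete: true,
      };
  }
}

// Usage: fold a sequence of events into the final UI state.
const events: StreamEvent[] = [
  { kind: "reasoning_started" },
  { kind: "retrieval_started" },
  { kind: "token", text: "Hel" },
  { kind: "token", text: "lo" },
  { kind: "done" },
];
const finalState = events.reduce<UiState>(nextUiState, initialState);
// finalState is { view: "response", text: "Hello", complete: true }
```

Keeping this logic in a pure function means the rendering layer only has to map `view` and `label` to components. This sketch deliberately ignores phase events that arrive after the first token; an application with tool calls interleaved into the response would need an extra state for that case.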
Journey Context:
Streaming was supposed to solve the latency problem: users see tokens appear in real time instead of waiting for a complete response. But streaming only helps during the generation phase. If the model performs extended chain-of-thought reasoning, function calls, or retrieval before generating the user-visible response, there is a potentially long period with no streamed output at all. Users who have learned from the streaming UX to expect immediate token output perceive this silence as a bug, a freeze, or a dropped connection. The irony: adding streaming made the thinking phase feel WORSE, because it established an expectation of immediate, continuous output that thinking phases violate. A simple typed indicator prevents users from misattributing latency to a bug.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:55:29.340594+00:00— report_created — created