Report #93662
[gotcha] AI streaming pauses mid-generation and users abandon the interaction thinking it crashed
Maintain perceived motion during generation stalls. Show pulsing cursors, skeleton animations, or explicit 'still generating...' indicators when token streaming pauses for more than 2-3 seconds. For reasoning-heavy models, use a two-phase UI: show a 'thinking' state with animation during reasoning, then stream the final answer. Never leave the UI completely static during an active generation.
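One way to make the "never static" rule concrete is to derive the visible indicator from just two facts: whether the request is still active, and when the last token arrived. The sketch below is framework-agnostic and illustrative only. The names `Indicator` and `pick_indicator` are hypothetical, and the 2.0 second default is simply the lower end of the 2-3 second range mentioned above.

```python
from enum import Enum
from typing import Optional


class Indicator(Enum):
    THINKING = "thinking"    # phase 1: animated reasoning state, no answer text yet
    STREAMING = "streaming"  # phase 2: tokens are arriving, the text itself is the motion
    STALLED = "stalled"      # tokens paused: show a pulsing cursor or 'still generating...'
    DONE = "done"            # request finished: the UI may be static


def pick_indicator(
    now: float,
    request_active: bool,
    last_token_at: Optional[float],
    stall_after: float = 2.0,  # assumed threshold, in seconds
) -> Indicator:
    """Choose what the UI should show at time `now` (seconds, any monotonic clock)."""
    if not request_active:
        return Indicator.DONE
    if last_token_at is None:
        # No answer token has arrived yet, so the model is still reasoning.
        return Indicator.THINKING
    if now - last_token_at > stall_after:
        # The stream started and then went quiet: keep something moving.
        return Indicator.STALLED
    return Indicator.STREAMING
```

Every state returned while the request is active maps to something animated, so a UI that calls this on a timer (for example a few times per second) cannot end up frozen mid-generation.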
Journey Context:
Streaming creates an expectation of continuous motion. When tokens flow, users perceive progress and the system feels responsive. But complex reasoning (especially with models that 'think' before responding) can cause multi-second pauses between tokens. The user sees the stream stop and assumes the system crashed or hung. The counter-intuitive finding: a non-streaming UI with a spinner for 30 seconds often feels LESS broken than a streaming UI that streams for 5 seconds then stalls for 25 seconds. The streaming start raised the expectation of continuous delivery, making the stall feel like a failure rather than normal processing. The fix is to maintain perceived motion even when token generation stalls: the UI should never appear frozen during an active request.
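A stall can also be surfaced at the stream level, so that the consumer always receives an event within a bounded time even when the model is silent. The following is a minimal sketch, assuming the tokens arrive as a Python async iterator of strings. The wrapper name `with_heartbeats`, the `("stalled", "")` event shape, and the 2.0 second default are assumptions made for illustration, not part of any particular SDK.

```python
import asyncio
from typing import AsyncIterator, Tuple

_END = object()  # sentinel marking the end of the upstream token stream


async def _next_or_end(it):
    """Fetch the next token, or return the sentinel when the stream is exhausted."""
    try:
        return await it.__anext__()
    except StopAsyncIteration:
        return _END


async def with_heartbeats(
    tokens: AsyncIterator[str],
    stall_after: float = 2.0,  # assumed threshold, in seconds
) -> AsyncIterator[Tuple[str, str]]:
    """Yield ("token", text) for each token and ("stalled", "") whenever
    stall_after seconds pass without one, repeating until tokens resume."""
    it = tokens.__aiter__()
    pending = asyncio.ensure_future(_next_or_end(it))
    try:
        while True:
            # Wait for the next token without cancelling the fetch on timeout.
            done, _ = await asyncio.wait({pending}, timeout=stall_after)
            if not done:
                yield ("stalled", "")  # nothing arrived: tell the UI to show motion
                continue
            item = pending.result()
            if item is _END:
                return
            yield ("token", item)
            pending = asyncio.ensure_future(_next_or_end(it))
    finally:
        pending.cancel()  # no-op if the fetch already completed
```

A consumer would append text on `"token"` events and switch on a 'still generating...' indicator on `"stalled"` events, clearing it when the next token arrives. `asyncio.wait` is used rather than `asyncio.wait_for` because the latter cancels the awaited fetch on timeout, which would terminate an upstream async generator mid-stream.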
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:47:45.210165+00:00— report_created — created