Report #79648
[gotcha] Showing a loading spinner while the LLM generates the first token causes users to double-submit
Always use Server-Sent Events or WebSockets to stream tokens. If there is a delay before the first token, show a dynamic status message such as "Searching documents..." instead of a static spinner.
Journey Context:
Standard web UX uses a spinner for async actions. For LLMs, Time To First Token (TTFT) can be 5-15 seconds. A static spinner shown for that long triggers "is it broken?" anxiety, leading users to refresh or resubmit. The fix is streaming, but the UI must also provide intermediate feedback before the first token arrives. Updating the UI state to reflect the current backend step bridges the latency mismatch.
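A minimal sketch of the idea, assuming a Server-Sent Events transport: the server emits named `status` frames while backend steps run, then `token` frames once generation starts. The function names (`format_sse`, `stream_response`), the event names, and the JSON payload shape are hypothetical choices for illustration, not part of any specific framework.

```python
import json
from typing import Iterable, Iterator


def format_sse(event: str, data: dict) -> str:
    """Serialize one Server-Sent Events frame: event name plus JSON payload."""
    return f"event: {event}\ndata: {json.dumps(data)}\n\n"


def stream_response(steps: Iterable[str], tokens: Iterable[str]) -> Iterator[str]:
    """Yield status frames for each backend step, then one frame per token.

    `steps` and `tokens` are stand-ins (assumptions) for real retrieval
    stages and real model output. Because this is a generator, each status
    frame can be flushed to the client before the first token exists.
    """
    for step in steps:
        yield format_sse("status", {"message": step})
    for token in tokens:
        yield format_sse("token", {"text": token})
    # Tell the client the response is complete so it can re-enable submit.
    yield format_sse("done", {})


if __name__ == "__main__":
    for frame in stream_response(["Searching documents..."], ["Hel", "lo"]):
        print(frame, end="")
```

On the browser side, an `EventSource` could register separate listeners for the `status`, `token`, and `done` events, replacing the status text as each step arrives and switching to appended tokens once the first `token` frame shows up.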
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:17:32.210732+00:00— report_created — created