Report #84595
[gotcha] High Time-To-First-Token (TTFT) causes users to spam retry thinking the app is frozen
Implement a dynamic loading state that explicitly communicates the AI's action (e.g., 'Reading your document...', 'Analyzing the code...') instead of a generic spinner, and disable the submit/retry button during generation.
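A minimal, framework-free sketch of this pattern follows. It is an illustration under assumptions, not a reference implementation: `GenerationGuard`, `Phase`, `ViewState`, and the 'Writing the response...' label are hypothetical names introduced here, and the progress events that drive `advance` are assumed to come from your own backend.

```typescript
// Hypothetical phases a request can be in; adjust to what your backend reports.
type Phase = "idle" | "reading" | "analyzing" | "streaming";
type ActivePhase = Exclude<Phase, "idle">;

// Contextual labels shown instead of a generic spinner (illustrative wording).
const STATUS_LABELS: Record<Phase, string> = {
  idle: "",
  reading: "Reading your document...",
  analyzing: "Analyzing the code...",
  streaming: "Writing the response...",
};

interface ViewState {
  statusText: string;      // text to render in the loading area
  submitDisabled: boolean; // bind to the submit/retry button's disabled state
}

class GenerationGuard {
  private phase: Phase = "idle";

  // Returns true if the submit was accepted, false if a request is already in flight.
  trySubmit(initialPhase: ActivePhase = "reading"): boolean {
    if (this.phase !== "idle") {
      return false;
    }
    this.phase = initialPhase;
    return true;
  }

  // Move to the next phase as progress is reported; ignored when nothing is in flight.
  advance(next: ActivePhase): void {
    if (this.phase !== "idle") {
      this.phase = next;
    }
  }

  // Call when the response completes or fails, so the button is re-enabled.
  finish(): void {
    this.phase = "idle";
  }

  view(): ViewState {
    return {
      statusText: STATUS_LABELS[this.phase],
      submitDisabled: this.phase !== "idle",
    };
  }
}
```

In a UI, the submit handler would call `trySubmit()` and return early when it yields `false`, then re-render from `view()` after every state change. Calling `finish()` in both the success and error paths matters, otherwise a failed request leaves the button disabled.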
Journey Context:
LLMs require significant compute before emitting the first token (prefill), and that cost grows with prompt length. Users are conditioned to web requests completing in under 2 seconds. If TTFT exceeds 3 seconds, users assume the request failed and click submit again, creating duplicate requests or breaking the chat state. A generic spinner doesn't help because it does not tell users whether the request is still progressing or has stalled. Contextual loading states bridge the expectation gap by translating AI latency into a familiar UX pattern.
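Disabling the button covers the visible path, but a repeated submit can still arrive through other routes (keyboard shortcuts, a second tab of handlers, a race before the re-render). One possible client-side backstop is to coalesce identical submits while the first is still pending. The sketch below is an assumption-laden illustration: `dedupeInFlight` is a hypothetical helper, `send` stands in for whatever function issues your request, and keying on the raw message text is a simplification.

```typescript
// Wraps a send function so that repeated submits of the same message share one
// in-flight request instead of creating duplicates.
function dedupeInFlight<T>(
  send: (message: string) => Promise<T>,
): (message: string) => Promise<T> {
  // Requests that have started but not yet settled, keyed by message text.
  const pending = new Map<string, Promise<T>>();

  return (message: string): Promise<T> => {
    const existing = pending.get(message);
    if (existing !== undefined) {
      return existing; // a retry click reuses the request already in flight
    }
    // Remove the entry once the request settles (success or failure),
    // so a genuine later retry can go through.
    const request = send(message).finally(() => {
      pending.delete(message);
    });
    pending.set(message, request);
    return request;
  };
}
```

This only guards a single client instance. If duplicate requests must never reach the model, a server-side check (for example an idempotency key chosen by the client) would be needed as well, which is outside what this sketch shows.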
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:35:03.569684+00:00— report_created — created