Report #92695
[gotcha] Extended thinking models show nothing for 10-60 seconds — users think the app is frozen
When using models with extended thinking, stream the thinking tokens as visible progress if the model supports it; otherwise, show phase-specific progress indicators ('Analyzing your request...', 'Building a response step by step...') that set appropriate time expectations for long waits.
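A minimal sketch of the fallback case, where thinking tokens cannot be streamed and the indicator is driven by elapsed time alone. The `PHASES` table, its thresholds, the third label, and the `phase_label` helper are all hypothetical and would need tuning against real latency data.

```python
# Hypothetical phase table: (seconds since submit, label shown to the user).
# The first two labels come from the report; the thresholds and the third
# label are illustrative assumptions.
PHASES = [
    (0.0, "Analyzing your request..."),
    (8.0, "Building a response step by step..."),
    (30.0, "Still working. Complex requests can take up to a minute..."),
]


def phase_label(elapsed_seconds: float) -> str:
    """Return the label of the latest phase whose start time has passed."""
    label = PHASES[0][1]
    for starts_at, text in PHASES:
        if elapsed_seconds >= starts_at:
            label = text
    return label
```

A UI timer could call `phase_label` about once per second with the time since submission and swap the indicator text whenever the returned label changes.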
Journey Context:
Models with extended thinking capabilities can spend 10-60 seconds in a thinking phase before producing any visible output. A generic loading spinner during that wait leads users to assume the app has frozen, so they refresh, abandon, or double-submit. The UX failure is treating AI thinking latency like traditional network latency: thinking latency is variable, often much longer, and gives no natural progress signals. Anthropic's extended thinking feature supports streaming thinking tokens, which should be surfaced as visible progress rather than hidden behind a spinner. If you cannot stream thinking tokens, at minimum replace generic spinners with phase-aware indicators that set appropriate expectations for potentially long waits.
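The streaming case can be sketched as a small reducer that routes thinking tokens and answer tokens to separate parts of the UI state. It works on plain dicts whose shape is modeled on Anthropic's streaming events (`content_block_delta` carrying a `thinking_delta` or `text_delta`); treat that shape as an assumption and check it against the current API documentation. `StreamView` and `apply_event` are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass


@dataclass
class StreamView:
    """UI state: what to show in the progress area and in the answer area."""
    thinking: str = ""
    answer: str = ""
    phase: str = "waiting"  # "waiting" -> "thinking" -> "answering"


def apply_event(view: StreamView, event: dict) -> StreamView:
    """Fold one stream event into the view; events of other types are ignored."""
    if event.get("type") != "content_block_delta":
        return view
    delta = event.get("delta", {})
    if delta.get("type") == "thinking_delta":
        # Surface thinking text as visible progress instead of a spinner.
        view.thinking += delta.get("thinking", "")
        view.phase = "thinking"
    elif delta.get("type") == "text_delta":
        view.answer += delta.get("text", "")
        view.phase = "answering"
    return view
```

The renderer can then show `view.thinking` in a collapsible progress area while `view.phase` is `"thinking"`, and switch to the answer area once the first text delta arrives.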
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:10:47.716702+00:00— report_created — created