Report #63997
[gotcha] Streaming AI responses create false user confidence in output deliberation
Show an explicit 'Analyzing...' or thinking indicator before streaming begins. Never rely on streaming animation as a signal of deliberation. Add calibrated confidence indicators or hedging language separately from the response format. Consider a deliberate pre-streaming pause that sets the right expectation.
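One way to picture the recommendation above is a thin wrapper that turns a raw token stream into explicit UI events. This is a minimal sketch, not a definitive implementation: the names `with_thinking_state`, `fake_model_stream`, `collect_events`, the event shapes, and the `min_pause_s` default are all hypothetical and not taken from any particular SDK.

```python
import asyncio
from typing import AsyncIterator, Dict, List


async def with_thinking_state(
    tokens: AsyncIterator[str],
    min_pause_s: float = 0.4,   # assumed default; tune with user testing
    confidence_note: str = "",
) -> AsyncIterator[Dict[str, str]]:
    """Wrap a token stream in explicit UI events (hypothetical event shapes)."""
    # 1. Show an explicit indicator before any token is rendered.
    yield {"type": "status", "text": "Analyzing..."}
    # 2. Deliberate pre-streaming pause that sets the expectation.
    await asyncio.sleep(min_pause_s)
    yield {"type": "stream_start", "text": ""}
    # 3. Forward tokens unchanged; the animation itself carries no signal.
    async for tok in tokens:
        yield {"type": "token", "text": tok}
    # 4. Confidence is reported as its own event, separate from the text.
    if confidence_note:
        yield {"type": "confidence", "text": confidence_note}
    yield {"type": "done", "text": ""}


async def fake_model_stream() -> AsyncIterator[str]:
    """Stand-in for a real model stream (illustration only)."""
    for tok in ["Paris", " is", " the", " capital", "."]:
        yield tok


async def collect_events(min_pause_s: float = 0.0) -> List[Dict[str, str]]:
    """Drain the wrapped stream into a list, as a UI layer might consume it."""
    note = "Low confidence: verify before relying on this."
    events = []
    async for ev in with_thinking_state(fake_model_stream(), min_pause_s, note):
        events.append(ev)
    return events
```

A UI layer would render the `status` event as the thinking indicator, append `token` events to the message body, and show the `confidence` event in a separate element so it is not mistaken for part of the answer.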
Journey Context:
Streaming was designed as a perceived-latency optimization: show tokens as they arrive instead of waiting for the full response. The side effect is that users interpret the token-by-token appearance as the AI 'thinking through' its answer step by step. In reality, the animation only reflects decoding and delivery. An autoregressive model emits each token once, conditioned on the tokens before it, and does not go back to revise earlier output, so with the same sampling the same text is produced whether it is streamed or returned as a batch. This conflation can lead users to trust streamed outputs more than identical batch outputs. Teams that add a pre-streaming 'thinking' state report better user calibration, even though it increases wall-clock time. The counter-intuitive result is that a slower experience can produce more accurate trust.
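The point that streaming is a delivery mechanism rather than a deliberation signal can be illustrated with a toy decoder. This is a sketch under loud assumptions: `next_token` is a deterministic stand-in for a model, `VOCAB` is made up, and no real model or library API is represented here. The streamed and batch paths consume the same generator and differ only in when text is shown.

```python
import random
from typing import Iterator, List

VOCAB = ["yes", "no", "maybe", "likely", "unclear"]  # hypothetical vocabulary


def next_token(context: List[str]) -> str:
    """Toy stand-in for a model: picks a token that depends only on context."""
    rng = random.Random(" ".join(context))  # string seed, so deterministic
    return rng.choice(VOCAB)


def toy_decode(prompt: str, max_tokens: int = 6) -> Iterator[str]:
    """Autoregressive loop: each token is emitted once and never revised."""
    context = [prompt]
    for _ in range(max_tokens):
        tok = next_token(context)
        context.append(tok)
        yield tok


def respond_streamed(prompt: str) -> List[str]:
    """Return every intermediate render a streaming UI would display."""
    shown = ""
    frames = []
    for tok in toy_decode(prompt):
        shown += tok + " "
        frames.append(shown)
    return frames


def respond_batch(prompt: str) -> str:
    """Return the full text at once, as a non-streaming UI would."""
    return "".join(tok + " " for tok in toy_decode(prompt))
```

In this sketch the last streamed frame equals the batch output, and every frame is a prefix of the next, which is why the animation alone says nothing about how carefully the answer was produced.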
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:54:31.073562+00:00— report_created — created