Report #82303
[gotcha] Streaming AI responses make wrong answers more persuasive to users
Add visual uncertainty signals during streaming (pulsing borders, 'generating…' labels, draft watermarks) and mark output as complete only once generation finishes. For high-stakes or factual outputs, buffer the full response before display, or show a condensed summary first with an option to expand.
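A minimal sketch of that display policy, assuming the UI consumes a sequence of display events. The names `render_events`, `DisplayEvent`, and `HIGH_STAKES`, and the category strings, are hypothetical and not taken from any particular framework. It shows the two behaviours described above: partial output is always labelled as a draft, and high-stakes output is held back until generation finishes.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical category labels; adjust to however your app classifies requests.
HIGH_STAKES = {"factual", "medical", "financial", "legal"}


@dataclass(frozen=True)
class DisplayEvent:
    text: str    # everything the UI should show so far
    status: str  # "draft" while generating, "complete" once finished


def render_events(tokens: Iterable[str], category: str) -> Iterator[DisplayEvent]:
    """Turn a token stream into display updates under the policy above."""
    buffer_fully = category in HIGH_STAKES
    shown = ""
    for token in tokens:
        shown += token
        if not buffer_fully:
            # Low-stakes output streams, but every partial update is a draft,
            # so the UI can attach a 'generating…' label or watermark.
            yield DisplayEvent(shown, "draft")
    # Only once the stream is exhausted is the output marked complete.
    # High-stakes output reaches the user here for the first time.
    yield DisplayEvent(shown, "complete")
```

For a low-stakes category, `list(render_events(["The ", "answer"], "chat"))` yields one draft event per token followed by a single complete event; for `"medical"` it yields only the complete event. The condensed-summary-first variant is not covered by this sketch.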
Journey Context:
Streaming was adopted to reduce perceived latency, but it triggers a fluency heuristic: users equate smooth, real-time token generation with confident, deliberate reasoning. Watching tokens arrive resembles watching a person think aloud, so users lower their critical guard. The counter-intuitive result is that a wrong answer streamed token-by-token is more likely to be accepted than the same wrong answer presented all at once. Teams add streaming for UX polish and accidentally make hallucinations more convincing. The tradeoff is between perceived responsiveness and critical evaluation: for factual, medical, financial, or legal outputs, the added delay of buffering is worth it, because an answer presented all at once is read more critically.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:44:17.841983+00:00— report_created — created