Report #92197
[gotcha] Streaming AI responses create a fluency bias that makes users overtrust output accuracy
For high-stakes outputs (code, medical, financial, legal), add a post-generation review step after streaming completes. Mark streamed content as 'draft' during generation and mark it 'complete' only once the stream reports a finish_reason. Never let the perceived speed of streaming substitute for verification.
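A minimal sketch of the draft/complete labelling, assuming each streamed chunk arrives as a dict with a "delta" text field and a "finish_reason" field that stays None until the final chunk. That chunk shape, and the names StreamedMessage, consume and ready_to_act, are hypothetical and not taken from any particular SDK.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StreamedMessage:
    text: str = ""
    status: str = "draft"              # stays "draft" until a finish_reason arrives
    finish_reason: Optional[str] = None
    reviewed: bool = False             # set by the post-generation review step


def consume(chunks: Iterable[dict]) -> StreamedMessage:
    """Accumulate streamed text; flip to 'complete' only on a finish_reason."""
    msg = StreamedMessage()
    for chunk in chunks:
        # Assumed chunk shape: {"delta": str | None, "finish_reason": str | None}
        msg.text += chunk.get("delta") or ""
        reason = chunk.get("finish_reason")
        if reason is not None:
            msg.finish_reason = reason
            msg.status = "complete"
    return msg


def ready_to_act(msg: StreamedMessage) -> bool:
    """High-stakes gate: generation must have finished AND been reviewed."""
    return msg.status == "complete" and msg.reviewed
```

A stream that is cut off before any finish_reason arrives leaves the message in 'draft', so the UI keeps the label and the gate stays closed. A stricter variant might also inspect the finish_reason value (for example, to flag truncated output), but that goes beyond what this report describes.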
Journey Context:
Streaming creates a powerful cognitive bias: users see coherent text appearing fluidly in real time and unconsciously rate it as more accurate, thoughtful, and authoritative than the identical text delivered all at once. This is the fluency heuristic from cognitive psychology: processing fluency is misattributed as a signal of truth and quality. The danger is specific to AI because the model is autoregressive, meaning it generates the first token before 'knowing' what the last token will be. Early tokens commit to a direction that later tokens may not fully support, but the streaming display makes the output feel premeditated and confident. For casual content this is harmless, but for code that will be deployed, medical information, or financial analysis, streaming creates dangerous overconfidence.

The counterintuitive fix is to add friction deliberately: a 'draft' label during streaming, a mandatory review step before acting on the output, or a brief delay before enabling action buttons. Any of these breaks the fluency illusion for high-stakes content. Low-stakes content can stream freely, but the UX must differentiate between the two.
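One way to express the differentiated friction is a small policy function that decides when action buttons (run, deploy, apply) become clickable. This is a sketch under stated assumptions: the Stakes levels, the 3-second delay, and the actions_enabled name are all hypothetical, and timestamps are passed in as plain seconds so the logic is easy to test.

```python
from enum import Enum
from typing import Optional


class Stakes(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical value: pause between stream completion and enabling actions.
HIGH_STAKES_DELAY_SECONDS = 3.0


def actions_enabled(
    stakes: Stakes,
    completed_at: Optional[float],   # None while the stream is still running
    now: float,
    reviewed: bool = False,
) -> bool:
    """Decide whether action buttons should be clickable right now."""
    if stakes is Stakes.LOW:
        return True                  # low-stakes content streams freely
    if completed_at is None:
        return False                 # still a draft
    if now - completed_at < HIGH_STAKES_DELAY_SECONDS:
        return False                 # deliberate pause after completion
    return reviewed                  # mandatory review before acting
```

Keeping the policy in one function makes the low-stakes and high-stakes paths visibly different and lets the delay be tuned without touching rendering code.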
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:20:45.029706+00:00 - report_created - created