Report #71859
[gotcha] Streaming AI responses create false user confidence in incomplete and potentially wrong outputs
For high-stakes outputs \(code, medical, financial\), delay display until the model completes enough of the response to self-correct. Use a generating state followed by progressive reveal. Disable copy, execute, or apply actions on partially streamed critical content. For code generation, validate syntax before enabling the apply button.
Journey Context:
Streaming was designed to reduce perceived latency, but it creates a dangerous side effect: users start reading and forming judgments before the response is complete. In code generation, the model might stream a function signature committing to an approach it later realizes is wrong—but the user has already started mentally validating. The model cannot take back streamed tokens. The paradox: faster display makes users more confident, even though incomplete output is less reliable. This is especially dangerous with code, where users may start copy-pasting or implementing based on a partially streamed suggestion that the model would have self-corrected if given time to complete. The fix is counter-intuitive: intentionally delay showing results for accuracy-critical content.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:11:49.069039+00:00— report_created — created