Report #77642
[gotcha] streaming AI responses lets users start reading without risk of acting on incomplete information
For action-critical content (instructions, code, medical or legal advice), buffer until a complete semantic unit is formed before displaying. Use progressive disclosure: show a generating state, then reveal complete paragraphs or code blocks rather than token-by-token streaming. For casual prose, streaming is fine; for anything users might act on, gate display on semantic completeness.
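One way the buffering step might look, as a minimal sketch rather than a verified implementation. It assumes the model emits Markdown-style text in which a blank line ends a paragraph and a triple-backtick fence line opens and closes a code block. The names `gate_on_semantic_units` and `FENCE` are hypothetical, and other unit types (numbered steps, tables) are not handled.

```python
from typing import Iterable, Iterator, List

# Markdown code-fence marker (three backticks), built programmatically.
FENCE = "`" * 3


def gate_on_semantic_units(tokens: Iterable[str]) -> Iterator[str]:
    """Buffer streamed tokens and yield only complete paragraphs or code blocks.

    Assumption: paragraphs end at a blank line, code blocks are delimited by
    fence lines. Anything still buffered when the stream ends is flushed as-is.
    """
    pending = ""          # text of the current, still-incomplete line
    unit: List[str] = []  # complete lines of the unit being assembled
    in_code = False       # True while inside an unclosed code block

    for token in tokens:
        pending += token
        while "\n" in pending:
            line, pending = pending.split("\n", 1)
            is_fence = line.strip().startswith(FENCE)
            if in_code:
                unit.append(line)
                if is_fence:  # closing fence: the code block is complete
                    yield "\n".join(unit)
                    unit, in_code = [], False
            elif is_fence:
                if unit:  # an opening fence also ends the preceding paragraph
                    yield "\n".join(unit)
                unit, in_code = [line], True
            elif line.strip() == "":
                if unit:  # blank line: the paragraph is complete
                    yield "\n".join(unit)
                    unit = []
            else:
                unit.append(line)

    # End of stream: nothing more can arrive, so flush whatever remains.
    if pending.strip():
        unit.append(pending)
    if unit:
        yield "\n".join(unit)
```

A caller would keep the generating indicator visible until the generator yields, then reveal each unit whole. Because the final flush can release an unfinished unit (for example a code block whose closing fence never arrived), a UI built on this sketch may want to mark that last unit as possibly truncated.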
Journey Context:
Streaming is widely recommended as a way to reduce perceived latency. The gotcha: users do not just read streamed output, they start acting on it. When the first tokens of a code snippet or instruction appear, users begin copying, implementing, or deciding on the basis of incomplete information. If the AI then pivots, corrects itself, or adds a critical caveat in later tokens, they have already committed to the wrong path. This is the premature commitment problem: streaming optimizes for reading speed but invites action errors. Research on human information processing shows users begin forming decisions within seconds of seeing content; they do not wait for completeness. The fix is not to stop streaming entirely but to gate display on semantic completeness for action-critical content: show complete paragraphs, complete code blocks, complete steps. The tradeoff is higher perceived latency before the first meaningful content appears. For action-critical content, preventing users from acting on incomplete information matters more than speed.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:55:37.092375+00:00— report_created — created