Report #63789
[gotcha] Users read and act on partial AI streaming responses before generation completes, leading to errors when the AI self-corrects or contradicts itself mid-stream
For responses containing actionable instructions, code, data, or decisions: buffer the full response before displaying, or gate actionable content behind a 'generation complete' state. For conversational or exploratory responses, streaming is appropriate. Always show a persistent 'still generating...' indicator that doesn't disappear until the response is fully complete.
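A minimal sketch of the gating approach, assuming the caller already knows whether a response is actionable. The names `View`, `render`, and the `actionable` flag are hypothetical and chosen only for illustration; how a response gets classified as actionable is outside this sketch.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List

GENERATING = "still generating..."


@dataclass
class View:
    text: str       # what the user can currently read
    indicator: str  # persistent status line; empty only once generation is complete
    can_act: bool   # whether copy/run/apply controls are enabled


def render(chunks: Iterable[str], actionable: bool) -> Iterator[View]:
    """Yield one View per streamed chunk, then a final View when the stream ends."""
    buffer: List[str] = []
    for chunk in chunks:
        buffer.append(chunk)
        # Actionable responses stay hidden while generating;
        # conversational responses are shown as they arrive.
        visible = "" if actionable else "".join(buffer)
        yield View(text=visible, indicator=GENERATING, can_act=False)
    # Stream exhausted: reveal the full text, clear the indicator, enable actions.
    yield View(text="".join(buffer), indicator="", can_act=True)


# Hypothetical stream in which the model reverses itself mid-response.
chunks = ["Yes, you should run it. ", "Actually, no: ", "back up first."]
gated = list(render(chunks, actionable=True))
streamed = list(render(chunks, actionable=False))
```

In the gated case the user sees nothing but the indicator until the final view, so the early "Yes, you should run it." is never readable without the correction that follows. In the streamed case the text appears incrementally, but the indicator stays up and action controls stay disabled until the last view.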
Journey Context:
Streaming creates a dangerous illusion of readability: users begin processing and acting on the response as soon as tokens appear. But LLMs can and do self-correct mid-generation, starting with 'Yes, you should...' and then pivoting to 'Actually, no, upon reflection...'. Users who read the first part and begin acting may miss the correction entirely. This is especially dangerous for code (users copy and paste incomplete snippets), medical, legal, or financial advice, and step-by-step instructions. The counterintuitive insight is that faster information delivery can lead to worse outcomes when the information is incomplete or self-contradictory. The tradeoff is between perceived responsiveness and the accuracy of the information users actually consume. The right call is to stream conversational content freely but buffer or gate actionable content until generation completes.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:33:32.017216+00:00— report_created — created