Report #86773
[gotcha] Users act on incomplete streaming AI responses before the model corrects or contradicts itself mid-stream
Disable action buttons, copy, and any executable actions until the stream completes (stop_reason received). If you must allow interaction during streaming, enable it only per complete semantic block (e.g., per finished paragraph or code block), not per token.
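A minimal sketch of the gating idea, assuming a UI layer that can query a small state object before enabling its buttons. The `StreamGate` class, its method names, and the `"end_turn"` value are illustrative assumptions, not any provider's API; wire `finish` to whatever completion signal your streaming client actually emits.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class StreamGate:
    """Buffers streamed text and keeps actions locked until the stream ends."""

    chunks: List[str] = field(default_factory=list)
    stop_reason: Optional[str] = None  # hypothetical field, set on completion

    def feed(self, text: str) -> None:
        # Text can be rendered as it arrives, but feeding never unlocks actions.
        if self.stop_reason is not None:
            raise RuntimeError("stream already completed")
        self.chunks.append(text)

    def finish(self, stop_reason: str) -> None:
        # Call only when the provider reports that the stream has ended.
        self.stop_reason = stop_reason

    @property
    def text(self) -> str:
        return "".join(self.chunks)

    @property
    def actions_enabled(self) -> bool:
        # Copy, run, and apply controls stay disabled for partial output.
        return self.stop_reason is not None


gate = StreamGate()
for piece in ["Yes, you should delete the cache", " ...however, in your case, keep it."]:
    gate.feed(piece)
    assert not gate.actions_enabled  # still streaming: buttons stay disabled
gate.finish("end_turn")  # example value; real stop reasons vary by provider
assert gate.actions_enabled
```

Whether a truncation-style stop reason (for example, a token limit) should also unlock actions is a design choice this sketch leaves open.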
Journey Context:
LLMs often start with a confident assertion ('Yes, you should delete...') and add critical caveats later ('...however, in your specific case...'). Users read the first tokens and act. This is strictly worse than batch responses, where the whole answer is consumed as a unit. Teams try to detect 'complete thoughts' via punctuation heuristics, but this is fragile because the model can contradict itself across sentences, so a sentence that ends cleanly can still be reversed by the next one. The only reliable signal is stream completion. The tradeoff: waiting feels slower, but it prevents users from executing on self-contradicting partial output. For code generation specifically, never auto-run or auto-apply code from a streaming response until the stream ends.
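For the code-generation case, one way to enforce the rule is to extract code only after a completion event has been seen, and to return nothing if the stream was cut off. The event shape below (dicts with `type`, `text`, and `stop_reason` keys) and the function name are assumptions made for illustration; real streaming clients use their own event types.

```python
import re
from typing import Iterable, List, Optional

FENCE_MARK = "`" * 3  # three backticks, built this way to keep the example readable
FENCED_BLOCK = re.compile(
    re.escape(FENCE_MARK) + r"[^\n]*\n(.*?)" + re.escape(FENCE_MARK),
    re.DOTALL,
)


def collect_code_after_stream(events: Iterable[dict]) -> List[str]:
    """Return fenced code blocks only if the stream ended with a stop event."""
    parts: List[str] = []
    stop_reason: Optional[str] = None
    for event in events:
        if event["type"] == "delta":
            parts.append(event["text"])
        elif event["type"] == "stop":
            stop_reason = event["stop_reason"]
    if stop_reason is None:
        # No completion signal (dropped connection, cancelled stream): apply nothing.
        return []
    return [match.group(1) for match in FENCED_BLOCK.finditer("".join(parts))]


events = [
    {"type": "delta", "text": FENCE_MARK + "python\nprint('hi')\n"},
    {"type": "delta", "text": FENCE_MARK + "\nActually, do not run this yet."},
]
assert collect_code_after_stream(events) == []  # no stop event, nothing to apply
events.append({"type": "stop", "stop_reason": "end_turn"})
assert collect_code_after_stream(events) == ["print('hi')\n"]
```

Even after completion, the extracted code still deserves review: the trailing text in this example retracts the block, which a fence parser cannot detect.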
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:14:22.355211+00:00— report_created — created