Report #56261
[gotcha] Users interact with partial AI streaming output before generation completes
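Disable all actionable UI elements (copy buttons, execute actions, submit triggers) until the stream terminates with a stop reason that signals normal completion, for example a finish_reason of 'stop' in OpenAI-style APIs or a stop_reason of 'end_turn' in Anthropic-style APIs. A stream that ends because of a token limit or a dropped connection is still incomplete and should leave those actions disabled. Render a persistent 'generating…' indicator that disappears only on stream completion. For code-generation UX, do not enable 'run' or 'apply' until the full block is received.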
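The gating rule above can be sketched as a small view-model that a UI layer would bind its buttons to. This is a minimal illustration, not a definitive implementation: `StreamedMessage`, `on_chunk`, `on_end`, and `COMPLETE_REASONS` are hypothetical names, and the set of "complete" reasons is an assumption that should be adjusted to whatever your provider actually reports.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: reasons that mean the model finished on its own.
# "stop" is OpenAI-style, "end_turn" is Anthropic-style.
COMPLETE_REASONS = {"stop", "end_turn"}


@dataclass
class StreamedMessage:
    """Hypothetical view-model for one streamed assistant message."""

    text: str = ""
    finish_reason: Optional[str] = None
    closed: bool = False  # True once the stream has ended for any reason

    def on_chunk(self, delta: str) -> None:
        """Append a streamed text delta; rejected after the stream has ended."""
        if self.closed:
            raise RuntimeError("chunk received after stream end")
        self.text += delta

    def on_end(self, finish_reason: Optional[str]) -> None:
        """Record how the stream ended (None models a dropped connection)."""
        self.finish_reason = finish_reason
        self.closed = True

    @property
    def generating(self) -> bool:
        """Drives the persistent 'generating…' indicator."""
        return not self.closed

    @property
    def actions_enabled(self) -> bool:
        """Copy/run/apply stay disabled unless generation completed normally."""
        return self.closed and self.finish_reason in COMPLETE_REASONS
```

With this shape, a response cut off by a token limit (for instance a reason of `"length"`) ends the indicator but never enables 'run' or 'apply', so the UI can offer a retry instead.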
Journey Context:
Streaming was designed to reduce perceived latency, but it introduces a failure mode unique to AI: users begin acting on incomplete content. They copy half-formed code snippets, execute truncated SQL, or make decisions based on reasoning that the model is about to reverse in the next token. This is especially insidious because it works fine in testing (short responses often arrive in one chunk) but breaks in production, where longer responses are split across many chunks. Some teams go further and buffer output into semantic chunks (paragraphs, code blocks) rather than revealing it token by token, which reduces the premature-action problem at the cost of a slightly longer wait before the first visible output.
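The semantic-chunk buffering mentioned above might look like the following sketch. It assumes the model emits markdown, treats a blank line as a paragraph boundary, and treats a line starting with three backticks as a code-fence boundary; `semantic_chunks` is a hypothetical helper, not part of any streaming SDK.

```python
from typing import Iterable, Iterator, List

FENCE = "`" * 3  # markdown code-fence marker


def semantic_chunks(deltas: Iterable[str]) -> Iterator[str]:
    """Yield whole paragraphs and whole fenced code blocks from token deltas."""
    buffer = ""             # trailing partial line, not yet ended by "\n"
    block: List[str] = []   # complete lines of the unit being assembled
    in_fence = False

    for delta in deltas:
        buffer += delta
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            is_fence = line.strip().startswith(FENCE)
            if is_fence and not in_fence:
                # Opening fence: release any pending paragraph first.
                if block:
                    yield "\n".join(block)
                block = [line]
                in_fence = True
            elif is_fence:
                # Closing fence: the code block is complete, release it whole.
                block.append(line)
                yield "\n".join(block)
                block = []
                in_fence = False
            elif in_fence:
                block.append(line)
            elif line.strip() == "":
                # Blank line outside a fence ends the current paragraph.
                if block:
                    yield "\n".join(block)
                    block = []
            else:
                block.append(line)

    # End of stream: flush whatever is left, even an unterminated block.
    if buffer:
        block.append(buffer)
    if block:
        yield "\n".join(block)
```

Because the final flush can release an unterminated code block when the stream is cut off, this buffering complements the stop-reason check rather than replacing it.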
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:55:37.213992+00:00— report_created — created