Report #53606
[gotcha] Streaming AI responses make wrong answers feel more convincing than batch responses
Add clear visual signals that streamed content is provisional until generation completes. Use a distinct 'generating' state that transitions to a 'complete' state. Never auto-execute actions or auto-apply code from partially streamed content. For high-stakes outputs, add a post-generation review affordance before any irreversible action.
Journey Context:
Streaming tokens one by one mimics human writing, which triggers an unconscious social heuristic: we assume the 'writer' is confident because they commit to each word in sequence. In batch mode, users evaluate the complete answer as a whole and are more likely to spot errors. In streaming mode, users build a narrative incrementally and are less likely to backtrack and question earlier tokens. This is the 'incremental commitment' effect from persuasion research: each accepted token makes the next one easier to accept. The effect is most dangerous when the AI is wrong, because the user has already mentally accepted the premise by the time the error becomes apparent, so the wrong answer feels more authoritative. The tradeoff: streaming dramatically improves perceived latency (the user waits only for time-to-first-token rather than time-to-complete-response), so removing it hurts UX significantly. The right approach is to keep streaming for display, add strong completion signals, and never act on partial output.
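One way to sketch the "keep streaming for display, never act on partial output" rule is a small state holder that can always be rendered but only hands out actionable text once generation has finished. Everything named here (`StreamedResponse`, `StreamState`, `NotActionableError`, the `high_stakes` flag, the bracketed labels) is hypothetical and chosen for illustration; it is not taken from any particular UI framework or SDK, and a real client would map the two states onto its own visual treatment.

```python
from dataclasses import dataclass, field
from enum import Enum


class StreamState(Enum):
    GENERATING = "generating"  # tokens still arriving; content is provisional
    COMPLETE = "complete"      # generation finished; content is final


class NotActionableError(RuntimeError):
    """Raised when a caller tries to act on output that is not ready."""


@dataclass
class StreamedResponse:
    # Hypothetical flag: set when the output could drive an irreversible action.
    high_stakes: bool = False
    state: StreamState = StreamState.GENERATING
    reviewed: bool = False
    _chunks: list[str] = field(default_factory=list)

    def append(self, token: str) -> None:
        """Accept one streamed token while generation is in progress."""
        if self.state is StreamState.COMPLETE:
            raise RuntimeError("cannot append after completion")
        self._chunks.append(token)

    def finish(self) -> None:
        """Mark generation as done; the text is now final."""
        self.state = StreamState.COMPLETE

    def mark_reviewed(self) -> None:
        """Record that a human reviewed the finished output."""
        if self.state is not StreamState.COMPLETE:
            raise NotActionableError("cannot review a partial response")
        self.reviewed = True

    @property
    def text(self) -> str:
        return "".join(self._chunks)

    def render(self) -> str:
        """Display path: always allowed, but labelled with the current state."""
        label = "[generating...]" if self.state is StreamState.GENERATING else "[complete]"
        return f"{label} {self.text}"

    def actionable_text(self) -> str:
        """Action path: refuses partial output and unreviewed high-stakes output."""
        if self.state is not StreamState.COMPLETE:
            raise NotActionableError("response is still generating")
        if self.high_stakes and not self.reviewed:
            raise NotActionableError("high-stakes output needs review first")
        return self.text
```

The design choice is that display and action go through different methods: `render()` works at any time so perceived latency is unaffected, while anything that would execute or apply the output has to call `actionable_text()`, which fails loudly until the stream is complete (and, for high-stakes output, reviewed).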
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:28:33.889824+00:00— report_created — created