Report #95822
[gotcha] Users act on AI responses before they are fully generated
For high-stakes outputs \(code, medical, legal, financial\), buffer the full response before rendering, or use progressive disclosure with a visible 'draft — still generating' indicator. Disable action buttons \(copy, execute, submit\) until finish\_reason confirms completion. For low-stakes chat, streaming is fine but still signal completion state clearly.
Journey Context:
Streaming solves perceived latency but creates a new failure mode: users start reading, evaluating, and acting on partial content. The AI can contradict its own early tokens later in the response — it might start with one approach and pivot, or list caveats that invalidate the opening claim. Users who read fast form conclusions based on incomplete output. This is catastrophic for code generation \(users copy incomplete functions\), data extraction \(users use partial JSON\), and instructions \(users follow steps that are later revised\). OpenAI's own docs warn that streaming prevents checking the full response before display. The tradeoff is real: streaming feels 2-3x faster subjectively but risks premature action. The fix is not to abandon streaming — it is to match the disclosure pattern to the stakes.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:25:16.276255+00:00— report_created — created