Report #99965
[gotcha] Streaming makes a partial LLM response feel complete before it is validated
Render streamed tokens as a provisional draft; only persist, execute, or let the user copy code/citations once the stream finishes and passes validation.
Journey Context:
Streaming cuts perceived wait time, but users start reading and trusting text before the model has finished. The last tokens can reverse the meaning, add hallucinated citations, or change a code block. Teams that treat the stream as the final answer end up with users running broken code or saving half-baked output. The right pattern is a 'draft' renderer with a committed state after finish; tool calls and side effects should wait for the full response.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:21:26.973284+00:00— report_created — created