Report #85526
[gotcha] Why do partially streamed AI responses look like complete answers to users when generation fails mid-stream?
When a streaming response terminates unexpectedly because of an error, timeout, or content filter, never leave the partial text as-is. Immediately: (1) visually mark the response as incomplete with a clear indicator such as an orange border or a 'Response interrupted' badge, (2) offer a one-click 'Continue generating' or 'Retry from here' action, and (3) if the partial response ends mid-sentence, append an ellipsis and the interruption notice. Your streaming handler must differentiate between model-initiated stops (end-of-sequence token) and error-initiated stops. This distinction is critical, and many streaming implementations ignore it.
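The three steps above can be sketched as a small finalization function that runs once the stream has ended. This is a minimal illustration, not a definitive implementation: `RenderedResponse`, `finalize_response`, the notice text, and the set of sentence-ending characters are all hypothetical names and choices, and a real UI would map the `incomplete` flag to its own styling (for example the orange border).

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: text ending in one of these is treated as a finished sentence.
SENTENCE_ENDINGS = (".", "!", "?", ":")
INTERRUPTION_NOTICE = "[Response interrupted]"


@dataclass
class RenderedResponse:
    """Hypothetical view model handed to the UI layer."""
    text: str
    incomplete: bool
    badge: Optional[str] = None
    actions: List[str] = field(default_factory=list)


def finalize_response(partial_text: str, ended_normally: bool) -> RenderedResponse:
    """Decide how the final message should be displayed."""
    if ended_normally:
        # Model-initiated stop: show the text unchanged.
        return RenderedResponse(text=partial_text, incomplete=False)

    text = partial_text.rstrip()
    # (3) Mid-sentence cut-off: append an ellipsis before the notice.
    if text and not text.endswith(SENTENCE_ENDINGS):
        text += "..."
    text = f"{text}\n\n{INTERRUPTION_NOTICE}" if text else INTERRUPTION_NOTICE

    return RenderedResponse(
        text=text,
        incomplete=True,                      # (1) drives the visual marker
        badge="Response interrupted",         # (1) badge label
        actions=["Continue generating", "Retry from here"],  # (2) recovery
    )
```

Calling `finalize_response("Install the package and then run", ended_normally=False)` yields text ending in `...` followed by the notice, with both recovery actions attached; a normally ended response passes through untouched.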
Journey Context:
When streaming fails mid-generation (because of API errors, timeouts, rate limits, or content filter triggers), the partial text is already displayed on screen. Users read it as a complete response, especially if it happens to end near a period or paragraph break. This is a silent failure: the user receives a truncated answer that looks intentional, acts on it, and only later discovers it was incomplete. The trap is that streaming libraries typically just stop emitting tokens on error, leaving whatever was already rendered as the final output, with nothing to distinguish it visually from a complete response. This is particularly dangerous for code generation or instructional content, where a partial answer can be syntactically valid but functionally wrong. Buffering the full response before displaying it avoids the problem but sacrifices the entire benefit of streaming UX. The right call is defensive streaming: your handler must track whether the stream ended with a proper stop token or an abnormal termination, and on abnormal termination, immediately modify the UI to signal incompleteness and offer recovery actions.
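A defensive stream consumer along these lines might look like the sketch below. It assumes a hypothetical `Chunk` shape with a `finish_reason` field where the value `"stop"` signals an end-of-sequence stop; real provider SDKs use their own field names and values, so treat every identifier here as illustrative. The key design choice is that the stream counts as complete only when an explicit stop signal arrives; an exception, any other finish reason, or a stream that simply runs dry is classified as abnormal.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, Iterable, List, Optional


class EndKind(Enum):
    MODEL_STOP = "model_stop"   # model emitted its end-of-sequence signal
    ABNORMAL = "abnormal"       # error, timeout, filter, or silent cut-off


@dataclass
class Chunk:
    """Hypothetical chunk shape; real APIs differ."""
    text: str = ""
    finish_reason: Optional[str] = None  # assumption: "stop" means normal end


@dataclass
class StreamOutcome:
    text: str
    end: EndKind
    detail: str


def consume_stream(chunks: Iterable[Chunk],
                   on_token: Callable[[str], None]) -> StreamOutcome:
    """Render tokens as they arrive and record how the stream ended."""
    parts: List[str] = []
    try:
        for chunk in chunks:
            if chunk.text:
                parts.append(chunk.text)
                on_token(chunk.text)  # incremental display
            if chunk.finish_reason == "stop":
                return StreamOutcome("".join(parts), EndKind.MODEL_STOP, "stop")
            if chunk.finish_reason is not None:
                # Any other reason (e.g. a content filter) is not a clean stop.
                return StreamOutcome("".join(parts), EndKind.ABNORMAL,
                                     chunk.finish_reason)
    except Exception as exc:  # network error, timeout, rate limit, ...
        return StreamOutcome("".join(parts), EndKind.ABNORMAL,
                             f"{type(exc).__name__}: {exc}")
    # The iterator ended without any finish signal: assume it was cut off.
    return StreamOutcome("".join(parts), EndKind.ABNORMAL,
                         "stream ended without a finish signal")
```

The caller can then branch on `outcome.end`: leave the message alone for `EndKind.MODEL_STOP`, and switch the UI to its interrupted state with recovery actions for `EndKind.ABNORMAL`.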
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:08:21.256530+00:00— report_created — created