Report #82896
[gotcha] Why do users assume a streaming AI response is complete when the stream ends?
Use explicit completion signals that do not depend on the stream ending. Add a visual marker such as a completion indicator, a summary line, or a 'Response complete' badge. For long responses, show the structure upfront (headings, estimated number of sections) so users know what to expect. If the response may be truncated by a token limit, warn the user before they start reading. Always surface the finish_reason from the API in the UI when it is 'length' rather than 'stop'.
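As a minimal sketch of that last point, the function below turns a finish_reason into an explicit status the UI can render as a badge. The function name, the dictionary shape, and the label wording are hypothetical. Treating a missing or unrecognized value as "not confirmed complete" is an assumption, chosen so that nothing except 'stop' ever shows the complete badge.

```python
from typing import Optional, TypedDict


class CompletionStatus(TypedDict):
    complete: bool
    label: str


def completion_status(finish_reason: Optional[str]) -> CompletionStatus:
    """Translate a finish_reason into an explicit, user-visible status.

    Hypothetical helper: only 'stop' counts as a confirmed natural ending.
    """
    if finish_reason == "stop":
        return {"complete": True, "label": "Response complete"}
    if finish_reason == "length":
        # The model hit the token limit, so the text is cut off.
        return {"complete": False, "label": "Response truncated: token limit reached"}
    if finish_reason is None:
        # The stream ended without ever reporting why (assumed: error or disconnect).
        return {"complete": False, "label": "Response interrupted: no finish reason received"}
    # Any other value is shown as-is rather than silently treated as success.
    return {"complete": False, "label": f"Response ended early ({finish_reason})"}
```

The frontend would render the label next to the response and could offer a "continue" action whenever `complete` is false.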
Journey Context:
When a response streams in, users begin reading and evaluating it immediately. If the stream ends mid-thought (because of a max_tokens limit, an API error, or the model stopping on its own), users often do not realize the response is incomplete. They assume the AI finished making its point. This is especially dangerous for code generation, where incomplete code can look complete at a glance despite missing closing braces or truncated functions, and for instructions, where a missing final step can be safety-critical. The streaming UX pattern was designed for perceived responsiveness, but it creates a false sense of completeness, and users rarely scroll back to check whether something was cut off. The gotcha is that the API returns a finish_reason field that tells you whether the response ended naturally ('stop') or was truncated ('length'), yet many frontends ignore this signal entirely. A stream that dies from an error may never deliver a finish_reason at all, so a missing value should also be treated as incomplete. Forwarding the field to the client is usually a small backend change, and it prevents a critical UX failure.
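The backend side of this can be sketched as a relay that forwards text deltas and then always emits one final event carrying the finish_reason. This is an illustration under stated assumptions, not a definitive implementation: it assumes each chunk is a dict shaped like an OpenAI-style chat-completions streaming chunk (`choices[0].delta.content` and `choices[0].finish_reason`), and the `relay_stream` name and the event format are hypothetical.

```python
from typing import Any, Dict, Iterable, Iterator, Optional


def relay_stream(chunks: Iterable[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Forward text deltas, then emit exactly one final 'done' event.

    Assumes OpenAI-style chunk dicts; adapt the field access to your API.
    """
    finish_reason: Optional[str] = None
    error: Optional[str] = None
    try:
        for chunk in chunks:
            choices = chunk.get("choices") or []
            if not choices:
                continue
            choice = choices[0]
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield {"type": "delta", "text": text}
            # Only the last chunk normally carries a non-null finish_reason.
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    except Exception as exc:
        # An upstream failure mid-stream: report it instead of ending silently.
        error = str(exc)
    # The client never has to infer completion from the connection closing.
    yield {"type": "done", "finish_reason": finish_reason, "error": error}
```

With this shape, a client that sees the connection close without a `done` event, or a `done` event whose finish_reason is anything other than 'stop', knows the response should not be presented as complete.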
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:43:39.807078+00:00 - report_created - created