Report #25306
[gotcha] Truncated streaming responses silently appear as complete answers in UI
Always render streaming responses with a visible 'generating' indicator that persists until the terminal stream chunk arrives. Treat the response as complete only when that chunk carries finish_reason: 'stop'. If the stream ends with finish_reason: 'length', or the connection drops mid-stream, mark the response as truncated and offer a 'continue generating' action. Never remove the loading indicator until you have confirmed the terminal chunk.
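A minimal sketch of this rule, assuming the UI derives everything it shows from one status value. The names `StreamStatus`, `ViewState`, `status_for`, and `view_state_for` are hypothetical, made up for illustration; only the finish_reason values 'stop' and 'length' come from this report. Treating every other non-'stop' reason as truncated is a conservative assumption of the sketch, not documented API behavior.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class StreamStatus(Enum):
    GENERATING = "generating"  # no terminal chunk confirmed yet
    COMPLETE = "complete"      # terminal chunk reported finish_reason 'stop'
    TRUNCATED = "truncated"    # 'length', or the stream closed with no terminal chunk


@dataclass(frozen=True)
class ViewState:
    show_indicator: bool  # the 'generating' indicator
    show_truncated: bool  # a visible "response was cut off" marker
    offer_continue: bool  # the 'continue generating' action


def status_for(finish_reason: Optional[str], stream_closed: bool) -> StreamStatus:
    """Map what the stream reported so far to a UI status (hypothetical helper)."""
    if finish_reason == "stop":
        return StreamStatus.COMPLETE
    # Assumption: any other reason, or a closed stream with no reason at all,
    # is shown as truncated rather than silently accepted as complete.
    if finish_reason is not None or stream_closed:
        return StreamStatus.TRUNCATED
    return StreamStatus.GENERATING


def view_state_for(status: StreamStatus) -> ViewState:
    """The indicator is visible only while generating; truncation enables 'continue'."""
    truncated = status is StreamStatus.TRUNCATED
    return ViewState(
        show_indicator=status is StreamStatus.GENERATING,
        show_truncated=truncated,
        offer_continue=truncated,
    )
```

Because the indicator is derived from the status instead of being toggled by hand, it cannot disappear unless a terminal outcome has been recorded.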
Journey Context:
When a streaming connection drops or the model hits a token limit, the UI is often left holding a partial response that looks syntactically complete: a sentence that happens to end with a period, or a code block that happens to close its brackets. Users read it as a complete answer and move on, carrying incorrect or incomplete information. This is especially dangerous for code generation, where a truncated function may compile but have the wrong logic. The common mistake is to check only for stream errors (which raise exceptions) and not for terminations that are graceful but incomplete. OpenAI's streaming API returns finish_reason in the last chunk ('stop' means complete, 'length' means truncated), but many implementations don't propagate this distinction to the UI layer. The indicator must persist until the terminal event is confirmed, not just until tokens stop arriving: a pause in token delivery is not a completion signal.
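The three ways a stream can end can be told apart in the consumer itself. The sketch below assumes chunks have already been parsed into plain dicts shaped like chat completion chunks (`choices[0].delta.content` and `choices[0].finish_reason`); `consume_stream`, `StreamResult`, and the outcome labels are hypothetical names, and catching `ConnectionError` stands in for whatever exception your HTTP client actually raises on a dropped connection.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class StreamResult:
    text: str
    finish_reason: Optional[str]
    outcome: str  # 'complete', 'truncated', or 'dropped'


def consume_stream(chunks: Iterable[dict]) -> StreamResult:
    """Accumulate streamed text and classify how the stream ended."""
    parts: List[str] = []
    finish_reason: Optional[str] = None
    try:
        for chunk in chunks:
            choices = chunk.get("choices") or []
            if not choices:
                continue  # e.g. a chunk that carries no choice data
            choice = choices[0]
            content = (choice.get("delta") or {}).get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason") is not None:
                finish_reason = choice["finish_reason"]
    except ConnectionError:
        # Loud failure: the connection dropped mid-stream.
        return StreamResult("".join(parts), finish_reason, "dropped")

    text = "".join(parts)
    if finish_reason == "stop":
        return StreamResult(text, finish_reason, "complete")
    if finish_reason is None:
        # Quiet failure: the iterator ended without any terminal chunk.
        return StreamResult(text, None, "dropped")
    # 'length' is the case from this report; other reasons are treated the
    # same way here as a conservative assumption.
    return StreamResult(text, finish_reason, "truncated")
```

Note that the outcome is decided only when the iterator finishes or raises, so a pause between chunks never changes it. The UI layer would receive `outcome` alongside `text` and keep the indicator up until one of the three values is set.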
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:52:47.701529+00:00— report_created — created