Report #54304
[gotcha] interrupted AI stream looks like a complete but short response with no error
Always check finish_reason in the final streaming chunk. If it is anything other than 'stop' (e.g., 'length' or 'content_filter'), or if the stream terminates without a done signal, display a clear 'Response incomplete' indicator with the specific reason and a retry option. Also set a stream timeout: if no token arrives within N seconds, treat the stream as interrupted and surface the error.
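A minimal sketch of this check, assuming chunks arrive as dicts shaped like chat-completion stream chunks (`choices[0].delta.content` and `choices[0].finish_reason`). The names `StreamResult`, `consume_stream`, and `idle_timeout`, and the reason strings 'timeout', 'no_finish_reason', and 'error', are hypothetical and not part of any provider's API.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, List, Optional


@dataclass
class StreamResult:
    text: str       # whatever content arrived before the stream ended
    complete: bool  # True only when the final chunk reported 'stop'
    reason: str     # provider finish_reason, or a local label (assumed names)


async def consume_stream(
    chunks: AsyncIterator[dict], idle_timeout: float = 15.0
) -> StreamResult:
    """Collect streamed text and record why the stream ended."""
    parts: List[str] = []
    finish_reason: Optional[str] = None
    iterator = chunks.__aiter__()
    while True:
        try:
            # Wait at most idle_timeout seconds for the next chunk.
            chunk = await asyncio.wait_for(iterator.__anext__(), timeout=idle_timeout)
        except StopAsyncIteration:
            break  # the stream closed; finish_reason is checked below
        except asyncio.TimeoutError:
            return StreamResult("".join(parts), False, "timeout")
        except Exception:
            # Transport failure mid-stream (for example a dropped connection).
            return StreamResult("".join(parts), False, "error")
        choices = chunk.get("choices") or []
        if not choices:
            continue  # some chunks carry no choices; nothing to collect
        choice = choices[0]
        parts.append((choice.get("delta") or {}).get("content") or "")
        if choice.get("finish_reason") is not None:
            finish_reason = choice["finish_reason"]
    if finish_reason is None:
        # Closed without a termination signal: treat as interrupted.
        return StreamResult("".join(parts), False, "no_finish_reason")
    return StreamResult("".join(parts), finish_reason == "stop", finish_reason)
```

A caller would render `result.text` as usual and show the 'Response incomplete' indicator whenever `result.complete` is false, using `result.reason` as the specific cause.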
Journey Context:
The streaming protocol sends tokens as they arrive, followed by a final chunk containing finish_reason. Many implementations listen only for token content and ignore the termination signal. When a stream ends early, whether by hitting max_tokens (finish_reason: 'length'), content filtering (finish_reason: 'content_filter'), a rate limit, or a network error, the UI simply stops receiving tokens. To the user, this looks like the AI gave a terse answer and finished: there is no error, no spinner, and no indication that anything went wrong. Users then take the truncated response as the AI's actual answer and are confused when it is nonsensical or ends mid-sentence. This is especially dangerous with max_tokens limits: developers set a token limit to control costs without realizing that when the limit is hit, the response is silently truncated; the model does not restructure its output to fit within the limit. The fix requires: (1) always processing the final chunk's finish_reason; (2) stream timeout detection; (3) a visual distinction for incomplete responses (fade, truncation indicator, or a 'Response was cut off' message); (4) a one-click retry that resubmits with a higher token limit if the reason was 'length'.
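Items (3) and (4) can be sketched as a small mapping from the end-of-stream reason to a user-facing notice and a retry setting. This is an illustration under assumptions: `IncompleteNotice`, `notice_for`, the message wording, the doubling policy, and the 4096 ceiling are all hypothetical choices, not values from the report or from any provider.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class IncompleteNotice:
    message: str           # text for the 'Response incomplete' indicator
    retry_max_tokens: int  # token limit to use if the user clicks retry


def notice_for(
    reason: str, max_tokens: int, ceiling: int = 4096
) -> Optional[IncompleteNotice]:
    """Return a notice for an incomplete response, or None if it finished normally."""
    if reason == "stop":
        return None  # complete response: no indicator needed
    if reason == "length":
        # Assumed policy: double the limit on retry, capped at a ceiling.
        return IncompleteNotice(
            "Response was cut off: token limit reached.",
            min(max_tokens * 2, ceiling),
        )
    if reason == "content_filter":
        return IncompleteNotice(
            "Response incomplete: stopped by the content filter.", max_tokens
        )
    # Timeouts, missing finish_reason, transport errors, unknown reasons.
    return IncompleteNotice(f"Response incomplete: {reason}.", max_tokens)
```

For example, `notice_for("length", 256)` yields a notice whose `retry_max_tokens` is 512, while any reason other than 'length' keeps the original limit on retry.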
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:38:47.886777+00:00— report_created — created