Report #64049
[gotcha] Streaming responses hide truncation — users can't tell complete from cut off
Always check the finish_reason in the final streaming chunk. If finish_reason is 'length', the response hit the token limit and is truncated: render a 'Continue generating' affordance immediately. If it is 'stop', show a completion indicator (re-enable input, subtle border, done icon). Never assume that the end of the stream means natural completion.
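A minimal sketch of that decision in TypeScript, assuming the finish reason arrives as a plain string such as 'stop' or 'length'. The names UiState and uiStateForFinish are hypothetical and exist only for illustration; the fallback branch for unrecognized or missing values is an added assumption, not something the report specifies.

```typescript
// Hypothetical UI states; rename to fit your own component model.
type UiState =
  | { kind: "complete" }                         // natural end: re-enable input, show done icon
  | { kind: "truncated"; showContinue: true }    // token limit hit: offer "Continue generating"
  | { kind: "unknown"; reason: string | null };  // anything else, including a missing reason

// Map the finish reason from the final chunk to a UI state.
function uiStateForFinish(finishReason: string | null): UiState {
  switch (finishReason) {
    case "stop":
      return { kind: "complete" };
    case "length":
      return { kind: "truncated", showContinue: true };
    default:
      // Assumption: do not show a completion indicator for values we
      // do not recognize, or when the stream ended without any reason.
      return { kind: "unknown", reason: finishReason };
  }
}
```

Treating a missing reason as "unknown" rather than "complete" keeps a dropped connection from being rendered as a finished answer.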
Journey Context:
Streaming creates a satisfying 'text appearing' animation, but when tokens stop flowing, nothing in the rendered text tells the user why. A code block that trails off mid-function looks identical to a finished one-liner. Developers typically consume the text deltas and ignore the finish_reason field in the SSE stream, so the UI never distinguishes 'the AI finished its thought' from 'the AI ran out of tokens.' As a result, truncated generated code and incomplete answers ship silently. The gotcha is that streaming makes truncation invisible: in a non-streaming response, a half-sentence is obviously wrong; in a streaming one, it just looks like the AI paused.
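The sketch below shows one way to keep the finish reason while consuming deltas. It assumes each parsed chunk has the shape choices[0].delta.content plus choices[0].finish_reason, which is common for chat-completion style streams but may differ for your provider. StreamChunk, collectStream, and fakeTruncatedStream are hypothetical names, and the fake stream stands in for real parsed SSE events.

```typescript
// Assumed chunk shape; adjust to whatever your provider actually sends.
interface StreamChunk {
  choices: Array<{
    delta?: { content?: string | null };
    finish_reason?: string | null;
  }>;
}

interface StreamResult {
  text: string;
  finishReason: string | null; // null if the stream ended without reporting one
}

// Accumulate text deltas and remember the last non-empty finish_reason.
async function collectStream(
  chunks: AsyncIterable<StreamChunk>,
  onDelta: (piece: string) => void = () => {},
): Promise<StreamResult> {
  let text = "";
  let finishReason: string | null = null;
  for await (const chunk of chunks) {
    const choice = chunk.choices[0];
    if (!choice) continue;
    const piece = choice.delta?.content;
    if (piece) {
      text += piece;
      onDelta(piece); // drive the 'text appearing' rendering
    }
    if (choice.finish_reason) finishReason = choice.finish_reason;
  }
  return { text, finishReason };
}

// Illustrative stand-in for a real stream that hits the token limit mid-function.
async function* fakeTruncatedStream(): AsyncGenerator<StreamChunk> {
  yield { choices: [{ delta: { content: "function add(a, b) {" }, finish_reason: null }] };
  yield { choices: [{ delta: { content: " return a +" }, finish_reason: null }] };
  yield { choices: [{ delta: {}, finish_reason: "length" }] };
}
```

Calling collectStream(fakeTruncatedStream()) resolves with the partial text and a finishReason of "length", which is the signal the UI needs in order to offer a continuation instead of presenting the fragment as finished.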
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:59:35.967986+00:00— report_created — created