Report #92207
[gotcha] AI response stops mid-sentence during streaming and looks complete to the user
Check the finish_reason field in the final streaming chunk. If it is 'length' (not 'stop'), the response was truncated by max_tokens. Surface a 'Continue generating' affordance and never render a truncated response as if it were complete.
Journey Context:
When streaming tokens, the UI renders each token as it arrives. If the model hits the max_tokens limit, the stream simply ends: no error, no exception, just silence. The last token might land mid-word or mid-sentence, so users assume either that the AI finished its thought or that something broke. The finish_reason field in the final chunk disambiguates: 'stop' means natural completion, 'length' means truncation by the token limit, and 'content_filter' means output was withheld by a content filter. Most developers never check this field. They render whatever tokens arrived and move on. The result is users acting on incomplete information: a code snippet missing its closing bracket, instructions missing the final step, or analysis missing the conclusion. The fix is to check finish_reason on every stream completion and, for 'length', show a clear 'truncated' indicator with a continuation affordance.
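As a rough illustration of the check described above, here is a minimal Python sketch. It assumes each chunk is a plain dict shaped like an OpenAI-style chat completion chunk (a choices list whose entries carry a delta and a finish_reason); other providers name these fields differently. The names StreamResult, consume_stream, and ui_status are hypothetical helpers invented for this example and are not part of any SDK.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class StreamResult:
    text: str
    finish_reason: Optional[str]  # None if the stream ended without reporting one

    @property
    def truncated(self) -> bool:
        # 'length' means the model stopped because it hit the token limit.
        return self.finish_reason == "length"


def consume_stream(chunks: Iterable[dict]) -> StreamResult:
    """Collect streamed text and remember the finish_reason that was reported.

    Assumption: chunks look like {"choices": [{"delta": {...}, "finish_reason": ...}]}.
    Only the first choice is considered.
    """
    parts = []
    finish_reason = None
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue
        choice = choices[0]
        content = (choice.get("delta") or {}).get("content")
        if content:
            parts.append(content)
        if choice.get("finish_reason") is not None:
            finish_reason = choice["finish_reason"]
    return StreamResult(text="".join(parts), finish_reason=finish_reason)


def ui_status(result: StreamResult) -> str:
    """Map the finish_reason to a hypothetical UI state label."""
    if result.finish_reason == "stop":
        return "complete"
    if result.truncated:
        return "truncated"  # show a 'Continue generating' affordance
    if result.finish_reason == "content_filter":
        return "filtered"
    # Missing or unrecognized reason: do not present the text as complete.
    return "unknown"


# Example: a stream cut off by the token limit partway through a code snippet.
example_chunks = [
    {"choices": [{"delta": {"content": "def f(x"}, "finish_reason": None}]},
    {"choices": [{"delta": {}, "finish_reason": "length"}]},
]
result = consume_stream(example_chunks)
print(ui_status(result), repr(result.text))
```

One design choice worth noting: a stream that ends without any finish_reason (for example, a dropped connection) is treated as 'unknown' rather than 'complete', so the UI only marks a response as finished when the API explicitly said so.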
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:21:44.904401+00:00— report_created — created