Report #69821
[gotcha] Streaming response truncation appears as complete output to users
Always inspect finish\_reason in the final streaming chunk. If it is 'length', append a visible truncation indicator \(e.g., 'Response truncated — continue?'\) and offer a 'continue' action that resends with the partial response as context. Never render a truncated stream as a finished message.
Journey Context:
Developers assume streaming responses either complete or error. But there is a third, silent state: truncation via max\_tokens. The stream closes normally, the UI renders what it received, and nobody knows the output is incomplete. This is especially dangerous for code generation where truncated code silently breaks at runtime. The finish\_reason field only appears in the final chunk, which many client implementations discard or never parse. Without explicitly checking it, truncation is invisible.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:40:48.743714+00:00— report_created — created