Report #80692
[gotcha] Streaming response displays partial content then cuts off when content_filter refusal triggers mid-generation
Check finish_reason on every streamed chunk. When finish_reason is 'content_filter', immediately clear or visually strike through the partial content and display a graceful refusal message. Never leave partial, potentially misleading content displayed as a complete response. Treat all streamed content as tentative until the final chunk confirms completion with a non-filter finish_reason.
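A minimal sketch of the per-chunk check described above. It assumes each chunk is a plain dict shaped like an OpenAI-style chat completion chunk (`choices[0].delta.content` and `choices[0].finish_reason`); that shape, the `consume_stream` name, and the refusal wording are illustrative assumptions, not taken from this report or from any particular SDK.

```python
from typing import Iterable, List, Optional, Tuple

# Hypothetical wording; substitute whatever refusal copy the product uses.
REFUSAL_MESSAGE = "This response was stopped by a content filter."


def consume_stream(chunks: Iterable[dict]) -> Tuple[str, Optional[str]]:
    """Return (text_to_display, finish_reason) for one streamed response.

    Assumed chunk shape:
    {"choices": [{"delta": {"content": "..."}, "finish_reason": None}]}
    """
    tentative: List[str] = []  # streamed text, not yet confirmed
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue  # skip chunks that carry no choice data
        choice = choices[0]
        piece = (choice.get("delta") or {}).get("content")
        if piece:
            tentative.append(piece)
        finish_reason = choice.get("finish_reason")
        if finish_reason == "content_filter":
            # Withdraw everything streamed so far and show only the refusal.
            return REFUSAL_MESSAGE, finish_reason
        if finish_reason is not None:
            # Any non-filter finish reason confirms the accumulated text.
            return "".join(tentative), finish_reason
    # The stream ended without a finish_reason: the text stays unconfirmed.
    return "".join(tentative), None
```

A real UI would render the tentative text as it arrives rather than waiting for the return value; the point here is only that the check runs on every chunk and that a filter result replaces the partial text instead of being appended to it.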
Journey Context:
Developers often assume that streaming lets them append tokens straight to the UI and that refusals always happen before any content is generated. In reality, safety filters can trigger mid-generation, after several tokens have already been streamed and displayed. The partial content that precedes a filter is especially dangerous: it may contain the setup for a harmful response without the AI's intended moderation or context, which makes it more misleading than no response at all. Simply appending 'content filtered' after the partial text confuses users, who think the AI started answering and then broke. The correct pattern requires a UI state machine: streaming (tentative display) → complete (finalize display) OR filtered (clear or mark incomplete + refusal message). Many production apps ship without handling this edge case because it is rare in testing but surfaces at scale.
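The three-state pattern above could be sketched as a small view model. The `State` and `StreamView` names, the `render()` return shape, and the default refusal text are hypothetical, introduced only for illustration; a real front end would map the rendered dict onto its own components.

```python
from enum import Enum
from typing import List, Optional


class State(Enum):
    STREAMING = "streaming"  # content is shown but still tentative
    COMPLETE = "complete"    # content is finalized
    FILTERED = "filtered"    # partial content has been withdrawn


class StreamView:
    """Hypothetical view model; render() returns what the UI should show."""

    def __init__(self, refusal_message: str = "This response was withheld by a content filter.") -> None:
        self.state = State.STREAMING
        self.refusal_message = refusal_message
        self._parts: List[str] = []

    def on_chunk(self, text: Optional[str], finish_reason: Optional[str]) -> None:
        # COMPLETE and FILTERED are terminal: no further chunks are accepted.
        if self.state is not State.STREAMING:
            raise RuntimeError(f"chunk received in terminal state {self.state.name}")
        if text:
            self._parts.append(text)
        if finish_reason == "content_filter":
            self._parts.clear()  # drop the partial content entirely
            self.state = State.FILTERED
        elif finish_reason is not None:
            self.state = State.COMPLETE

    def render(self) -> dict:
        if self.state is State.FILTERED:
            # Show only the refusal, never the partial text plus a suffix.
            return {"state": self.state.value, "text": self.refusal_message, "tentative": False}
        return {
            "state": self.state.value,
            "text": "".join(self._parts),
            "tentative": self.state is State.STREAMING,
        }
```

This sketch clears the partial text on a filter. An app that prefers the strike-through option could keep `_parts` and add a flag to the rendered dict instead. Making the terminal states reject further chunks is a design choice that surfaces protocol bugs early rather than silently appending to a finalized message.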
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T18:02:53.184644+00:00— report_created — created