Report #30764
[gotcha] Streaming AI response renders partial harmful content before content filter refusal fires
Buffer streamed tokens and render them to the DOM only after verifying that finish_reason is 'stop' rather than 'content_filter'. Run a client-side moderation pass on the buffer before display, and never pipe SSE chunks directly into innerHTML.
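A minimal sketch of the buffer-and-verify idea, not a definitive implementation. The `StreamChunk` shape, the `VerifiedBuffer` class, and the `Moderator` callback are hypothetical names introduced for illustration; map them onto whatever your provider's streaming payload actually looks like. The sketch treats any finish reason other than 'stop' as blocked, which is a deliberately conservative assumption.

```typescript
// Hypothetical chunk shape; real field names depend on the provider.
interface StreamChunk {
  delta: string;                // text fragment carried by this chunk
  finishReason: string | null;  // null until the final chunk arrives
}

type Verdict =
  | { status: "pending" }
  | { status: "ok"; text: string }
  | { status: "blocked"; reason: string };

// Placeholder for a client-side moderation pass (assumption: caller supplies it).
type Moderator = (text: string) => boolean;

class VerifiedBuffer {
  private parts: string[] = [];
  private verdict: Verdict = { status: "pending" };
  private readonly isAllowed: Moderator;

  constructor(isAllowed: Moderator = () => true) {
    this.isAllowed = isAllowed;
  }

  // Accumulates a chunk without rendering anything. Text is released only
  // once the stream has ended with 'stop' and the moderation pass agrees.
  push(chunk: StreamChunk): Verdict {
    if (this.verdict.status !== "pending") return this.verdict; // ignore late chunks
    this.parts.push(chunk.delta);

    const reason = chunk.finishReason;
    if (reason === null) return this.verdict; // still streaming

    const text = this.parts.join("");
    this.parts = []; // the buffer is no longer needed after a final verdict
    if (reason !== "stop") {
      this.verdict = { status: "blocked", reason };
    } else if (!this.isAllowed(text)) {
      this.verdict = { status: "blocked", reason: "client_moderation" };
    } else {
      this.verdict = { status: "ok", text };
    }
    return this.verdict;
  }
}
```

In a browser, the caller would feed each parsed SSE chunk to `push` and, only when the verdict is "ok", assign the text with `element.textContent = verdict.text` instead of writing to innerHTML. Keeping the class free of DOM and network code makes the decision logic easy to unit test.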
Journey Context:
When streaming, tokens arrive incrementally and the moderation system evaluates the output in near real time. If harmful content is detected mid-generation, the stream terminates with finish_reason='content_filter'. By that point, you have already rendered the partial response, including the very content the filter was trying to block. The naive implementation (append each token as it arrives) creates a security hole where filtered content flashes on screen. Developers assume the API won't start streaming if content will be filtered, but the filter often catches issues only after generation has begun. The buffer-and-verify pattern adds latency (a few hundred milliseconds by this report's estimate, and more for long responses, since nothing is shown until the finish reason arrives) but prevents the worst case: showing users content that was explicitly filtered.
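To make the failure mode concrete, here is a small simulation of what a naive append-as-you-go UI puts on screen when the filter fires late. The `Chunk` shape, `naiveFrames`, and the sample stream are all hypothetical and exist only for illustration; no real API is called.

```typescript
// Hypothetical chunk shape for illustration; real payloads vary by provider.
interface Chunk {
  delta: string;
  finishReason: string | null;
}

// Replays a stream the way a naive UI would: every delta is appended to the
// visible text immediately. Returns each intermediate state of the screen.
function naiveFrames(chunks: Chunk[]): string[] {
  const frames: string[] = [];
  let visible = "";
  for (const chunk of chunks) {
    visible += chunk.delta;
    frames.push(visible);
    if (chunk.finishReason === "content_filter") {
      // The UI swaps in a refusal message, but only after the fact.
      visible = "[response withheld]";
      frames.push(visible);
    }
  }
  return frames;
}

// Simulated stream that the filter cuts off after three text chunks.
const filteredStream: Chunk[] = [
  { delta: "Step 1: ", finishReason: null },
  { delta: "<text the filter ", finishReason: null },
  { delta: "will block>", finishReason: null },
  { delta: "", finishReason: "content_filter" },
];

// Every frame before the last one was visible to the user.
const frames = naiveFrames(filteredStream);
console.log(frames);
```

The final frame shows the refusal, yet the earlier frames already exposed the blocked text. That window is what buffering until the finish reason is known closes.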
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:01:17.004432+00:00— report_created — created