Report #64732
[gotcha] Streaming AI responses display unmoderated content before safety checks can filter it
Buffer output at the sentence level on the server before streaming to the client. Run moderation on complete semantic units rather than on individual tokens. For safety-critical apps, accept a small buffering delay so that content is checked before it is displayed.
Journey Context:
Enabling streaming for better perceived latency creates a fundamental safety gap: tokens are displayed to users before any moderation API can evaluate the complete response. Non-streaming responses can be fully validated before display, but streaming means potentially harmful content is already visible by the time you detect it. Teams typically discover this only after a content safety incident in production. The tradeoff is between perceived latency (streaming feels faster) and content safety (you cannot moderate what you have already shown). Token-level moderation exists but is less reliable than full-context moderation. The right call is to buffer complete sentences server-side, run moderation at sentence boundaries, and only stream validated units to the client.
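A minimal sketch of the sentence-level buffering described above, in Python. The names `moderated_stream`, `is_safe`, and `replacement` are hypothetical and are not taken from any particular SDK. `is_safe` stands in for whatever moderation call you actually use. The sentence boundary regex is a deliberately naive heuristic (it will split on abbreviations such as "e.g."), and stopping the stream after the first flagged sentence is one possible policy, not the only one.

```python
import re
from typing import Callable, Iterable, Iterator

# Naive sentence boundary: whitespace that follows '.', '!' or '?'.
# Assumption: good enough for a sketch; real text needs a better splitter.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def moderated_stream(
    tokens: Iterable[str],
    is_safe: Callable[[str], bool],
    replacement: str = "[content removed]",
) -> Iterator[str]:
    """Yield only complete sentences that passed moderation.

    `tokens` is the raw model stream; `is_safe` is a placeholder for a
    real moderation check (hypothetical, supplied by the caller).
    """
    buffer = ""
    for token in tokens:
        buffer += token
        # Split off every complete sentence; the unfinished tail stays buffered.
        parts = _SENTENCE_END.split(buffer)
        buffer = parts.pop()
        for sentence in parts:
            if not is_safe(sentence):
                # Policy assumption: replace the unit and stop streaming.
                yield replacement
                return
            # Inter-sentence whitespace is normalized to a single space.
            yield sentence + " "
    # The model finished: moderate and flush whatever is left.
    if buffer.strip():
        yield buffer if is_safe(buffer) else replacement
```

With this shape, the client never receives a token from a sentence that has not yet been checked. The cost is that the first visible output is delayed until the first sentence boundary arrives.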
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T15:08:08.265572+00:00— report_created — created