Report #56734
[gotcha] Streaming AI responses display before content safety checks complete
Buffer the full response server-side and run all validation before displaying it. If streaming is required for latency, implement chunk-level moderation that holds each chunk back until it passes, with a kill-switch that cuts the stream as soon as validation fails. Never stream model output directly to the user when it requires safety, PII, or format validation.
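A minimal sketch of both options, assuming the model output arrives as an iterable of text chunks and that validation is a function returning True for safe text. The names `buffered_response`, `moderated_stream`, `StreamBlocked`, and the `window` parameter are hypothetical, not part of any library; `is_safe` stands in for whatever moderation, PII, or format check you actually run.

```python
from typing import Callable, Iterable, Iterator, List

# Hypothetical validator: returns True when the text is safe to display.
Validator = Callable[[str], bool]


class StreamBlocked(Exception):
    """Kill-switch signal: validation failed, stop sending output."""


def buffered_response(chunks: Iterable[str], is_safe: Validator) -> str:
    """Option 1: collect the whole response, validate once, then release it."""
    text = "".join(chunks)
    if not is_safe(text):
        raise StreamBlocked("full response failed validation")
    return text


def moderated_stream(
    chunks: Iterable[str], is_safe: Validator, window: int = 3
) -> Iterator[str]:
    """Option 2: hold chunks back in groups of `window` and release a group
    only after the text accumulated so far (released + held) passes validation.
    """
    released = ""            # text that has already been validated and sent
    pending: List[str] = []  # chunks received but not yet shown to the user

    def flush() -> Iterator[str]:
        nonlocal released, pending
        candidate = released + "".join(pending)
        if not is_safe(candidate):
            # Kill-switch: the pending chunks are never displayed.
            raise StreamBlocked("stream failed validation")
        yield from pending
        released, pending = candidate, []

    for chunk in chunks:
        pending.append(chunk)
        if len(pending) >= window:
            yield from flush()
    if pending:
        yield from flush()


if __name__ == "__main__":
    def no_secret(text: str) -> bool:
        return "SECRET" not in text

    demo = ["Hello ", "world, ", "the ", "SEC", "RET ", "is 42"]
    try:
        for piece in moderated_stream(demo, no_secret, window=3):
            print(piece, end="", flush=True)
    except StreamBlocked:
        print("\n[response withheld]")
```

A larger `window` trades latency for safety: with `window=1` the fragment "SEC" in the demo would already be on screen before the next chunk triggers the block, which is the gap this report describes. The caller still has to decide what to do with text that was released before the kill-switch fired.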
Journey Context:
Streaming reduces perceived latency but creates a fundamental safety gap: content is displayed before it is validated. Teams add streaming for the 'fast' UX feel, then discover that their moderation pipeline only runs on complete responses. The worst case: harmful or policy-violating content flashes on screen before you can stop it. Even if you cut the stream, the user has already seen it. OpenAI's streaming documentation notes that streaming makes completions harder to moderate, because partial completions are more difficult to evaluate. The tradeoff is real and must be made explicitly: streaming UX vs. safety guarantees. For consumer products, safety should win.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:43:18.125934+00:00— report_created — created