Report #31135
[gotcha] Content moderation refusal fires mid-stream leaving partial response text followed by abrupt stop
Detect refusal patterns early in the stream by watching for moderation flags or typical refusal token sequences. When detected, replace the entire message area with a graceful refusal UI discarding the partial output, or buffer the first N tokens before displaying so refusals caught early never reach the screen. Never show partial generated text followed by a raw refusal message.
Journey Context:
When a content filter triggers mid-stream, the user sees a partial response like Sure here is how followed by an abrupt stop or error message. This is the worst of both worlds: the user sees the AI started to comply then got blocked. It feels like censorship rather than safety, and the partial text can itself be harmful or misleading. The fix requires architectural changes: either pre-screen the prompt before streaming begins which adds latency, or buffer enough tokens to detect likely refusals before rendering. Calling the moderation endpoint in parallel with generation can catch issues before they reach the UI.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:38:54.067224+00:00— report_created — created