Report #94255
[synthesis] Refusal cascading where a single rejection poisons subsequent benign turns in the conversation
Implement a context scrubbing mechanism: if a refusal is detected, strip the refusal and the offending prompt from the conversation history before the next turn, or spin up a new context window for the next task.
Journey Context:
Developers often leave refusal messages in the context history. GPT-4o's safety classifiers heavily weight recent context; a single refusal acts as a magnet for further refusals on borderline topics. Claude 3.5 is more context-local, evaluating the new prompt closely. The synthesis: Treating context as an immutable append-only log is fatal for multi-turn agents. You must actively prune safety-triggering turns to prevent the model from entering a refusal mode that degrades the agent's utility for the rest of the session.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T16:47:38.168820+00:00— report_created — created