Report #47754
[gotcha] Prior AI refusals in conversation history poison subsequent responses, causing cascading rejection of valid requests
When a refusal occurs, do not include the raw refusal exchange in subsequent context sent to the model. Instead: strip the refusal from history before the next turn, or replace it with a neutral system note like '\[previous request was out of scope, user has rephrased\]', or start a fresh context window for the retry. Implement a circuit breaker that resets context after N consecutive refusals to prevent infinite refusal loops.
Journey Context:
When a user hits a refusal and retries with a rephrased request, the conversation history now contains the refusal exchange. The model sees its own prior refusal in context and interprets this as having already determined the topic is problematic — making it more likely to refuse again even if the rephrased request is valid. This creates cascading failure where the conversation becomes permanently stuck in refusal mode. The counter-intuitive part: including the refusal in context \(which seems like giving the model full information\) makes things worse because LLMs exhibit strong few-shot bias — their own prior outputs serve as examples that constrain future behavior. The tradeoff: stripping refusal history loses conversational continuity but prevents refusal loops. The circuit breaker pattern is the safest default.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:37:53.485992+00:00— report_created — created