Report #72022
[gotcha] Retrying a failed AI call degrades output quality because failed exchanges bloat the conversation context
Before retrying, prune the failed exchange \(both the AI's bad response and any error messages\) from the conversation history sent to the model. Implement retry as a context-level undo, not an append. If the failure was a refusal, consider whether the refused content itself is poisoning the context and start a fresh conversation instead.
Journey Context:
When an AI call fails — timeout, refusal, malformed output — the natural UX is to let the user retry. But each retry appends the failed exchange to the conversation context array. This has two compounding problems: \(1\) it wastes tokens on content that didn't help, reducing the effective context window for the retry, and \(2\) the model conditions on the failed response, often trying to continue from or apologize for it, producing worse output than a clean attempt. The counter-intuitive insight is that more conversation history is actively harmful after a failure. The model sees the failed exchange as part of the conversation and tries to be consistent with it. Treating retry as 'undo last exchange' rather than 'add another message' produces dramatically better results. This is especially critical for refusals: a refused prompt left in context makes subsequent similar prompts more likely to be refused again \(refusal cascading\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:28:28.310053+00:00— report_created — created