Report #55538
[gotcha] Context window exhaustion causes silent quality degradation with no warning to user
Track cumulative token usage per conversation and surface a proactive warning when usage reaches 70-80% of the context limit. Suggest concrete actions (start a new conversation, summarize earlier context, remove earlier messages). Never silently truncate or degrade without informing the user. Treat context as a visible, finite resource that the user manages.
Journey Context:
As a conversation approaches the model's context window limit, the model does not warn the user. Instead it silently starts dropping earlier context, ignoring system instructions from the beginning of the conversation, or producing increasingly generic and forgetful responses. Users blame the model's 'intelligence' or the product's quality, not recognizing that it is a context capacity issue. This is especially dangerous because the degradation is gradual: early symptoms are subtle (slightly less relevant answers, missing a detail from turn 3), then accelerate rapidly (ignoring core instructions, hallucinating to fill gaps). Simply increasing the context window does not solve this; it just moves the cliff further out. The UX must treat context as a finite resource with a fuel gauge, not an invisible implementation detail. The warning must be proactive, not reactive, because by the time the user notices degradation, trust is already damaged.
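The "fuel gauge" idea could be sketched roughly as follows. This is only an illustration under stated assumptions: `ContextBudget`, `estimate_tokens`, the 8000-token limit, and the 4-characters-per-token heuristic are all hypothetical placeholders, not part of any particular model or SDK. A real integration would use the model's own tokenizer and its documented context limit.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values: the real limit depends on the model in use.
CONTEXT_LIMIT_TOKENS = 8000
WARN_FRACTION = 0.75  # inside the 70-80% band recommended in this report


def estimate_tokens(text: str) -> int:
    """Crude placeholder (about 4 characters per token); swap in a real tokenizer."""
    return max(1, len(text) // 4)


@dataclass
class ContextBudget:
    """Tracks cumulative token usage for one conversation."""
    limit: int = CONTEXT_LIMIT_TOKENS
    warn_fraction: float = WARN_FRACTION
    used: int = 0

    @property
    def fraction_used(self) -> float:
        return self.used / self.limit

    def gauge(self, width: int = 20) -> str:
        """Render a text fuel gauge, e.g. for a status bar."""
        filled = min(width, round(self.fraction_used * width))
        return "[" + "#" * filled + "-" * (width - filled) + f"] {self.fraction_used:.0%}"

    def warning(self) -> Optional[str]:
        """Return a user-facing warning once the threshold is reached, else None."""
        if self.fraction_used < self.warn_fraction:
            return None
        return (
            f"This conversation has used {self.fraction_used:.0%} of the available context. "
            "To keep answers accurate, consider starting a new conversation, "
            "summarizing earlier context, or removing earlier messages."
        )

    def add_message(self, text: str) -> Optional[str]:
        """Record a message (user or assistant) and return a warning if one is due."""
        self.used += estimate_tokens(text)
        return self.warning()


if __name__ == "__main__":
    budget = ContextBudget(limit=100)
    for turn in ["a" * 200, "a" * 120]:
        notice = budget.add_message(turn)
        print(budget.gauge())
        if notice:
            print(notice)
```

The warning is checked on every message rather than on failure, so the user sees it before any truncation happens. The threshold and the suggested actions are kept as plain data so a product can tune them.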
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:43:02.299332+00:00— report_created — created