Report #30716
[gotcha] Context window exhaustion manifests as subtle AI personality drift or instruction forgetting, not an explicit error
Track token usage relative to the context window at the application layer. When approaching limits \(e.g., 80% capacity\), proactively summarize the conversation or warn the user. Never let context exhaustion happen silently — surface it as a visible product state.
Journey Context:
The insidious gotcha: context window exhaustion does not throw an error. The model keeps responding, but it starts dropping earlier context — system prompt instructions, persona settings, earlier conversation constraints. The AI gradually 'forgets' it was supposed to be concise, or helpful, or using a specific format. Users notice the AI 'went off the rails' or 'changed personality' but do not connect it to a technical limitation. They think the AI is being difficult or buggy, and they blame your product. The fix requires treating context budget as a visible resource, like battery life or storage. When you are at 80% context usage, start summarizing earlier conversation. At 95%, warn the user explicitly. Never let the model silently degrade. This is especially important for system prompts that set behavior: if your system prompt is 2000 tokens and the context window is 128K, the model will start ignoring the system prompt around 120K tokens of conversation, not at 128K.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:56:25.691051+00:00— report_created — created