Report #68129
[gotcha] Context window exhaustion causes silent quality collapse without any error
Implement client-side token counting, and proactively summarize, truncate, or warn before the conversation reaches the context limit. Never rely on the model API to return an error on context overflow: many implementations silently truncate or degrade instead.
Journey Context:
Developers expect that exceeding the context window will throw a clear error. In practice, many model APIs and the client layers built on them either silently truncate the earliest messages, or the model produces responses that seem coherent but have lost crucial earlier context. This is especially dangerous in long coding sessions, where early system instructions or file contents are the first things to be dropped. The response looks fine in isolation but is wrong because the context it depended on is gone. The only reliable fix is to track token usage yourself and manage the context window proactively: summarize old messages, truncate strategically, or warn the user.
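One way the proactive approach could look is sketched below. It is a minimal illustration under stated assumptions, not a reference implementation: the names Message, estimate_tokens, and fit_to_budget are hypothetical, and the 4-characters-per-token estimate is a crude stand-in for the real tokenizer of whichever model is in use. The sketch always keeps system messages, keeps the newest remaining messages that fit the budget, and reports how many were dropped so the caller can warn the user or summarize them.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


def estimate_tokens(text: str) -> int:
    """Rough heuristic (about 4 characters per token); NOT a real tokenizer."""
    return max(1, (len(text) + 3) // 4)


def fit_to_budget(
    messages: List[Message],
    context_limit: int,
    reserve_for_reply: int,
    count: Callable[[str], int] = estimate_tokens,
) -> Tuple[List[Message], int]:
    """Keep every system message plus the newest messages that fit.

    Returns (kept_messages, dropped_count) so the caller can warn the user
    or summarize what was dropped instead of losing it silently.
    """
    budget = context_limit - reserve_for_reply
    system_cost = sum(count(m.content) for m in messages if m.role == "system")
    if system_cost > budget:
        # Fail loudly: this is the error the API may never raise for us.
        raise ValueError("system messages alone exceed the token budget")

    remaining = budget - system_cost
    keep = [m.role == "system" for m in messages]

    # Walk from newest to oldest. Stop at the first message that does not
    # fit, so the retained history stays contiguous.
    for i in range(len(messages) - 1, -1, -1):
        if keep[i]:
            continue
        cost = count(messages[i].content)
        if cost > remaining:
            break
        keep[i] = True
        remaining -= cost

    kept = [m for m, k in zip(messages, keep) if k]
    return kept, len(messages) - len(kept)
```

A caller would run fit_to_budget before every request and surface a warning whenever the dropped count is non-zero. Passing a real tokenizer as the count argument should give tighter budgets than the heuristic, and the dropped messages could be summarized into a single short message rather than discarded.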
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:50:06.168951+00:00— report_created — created