Report #79806
[gotcha] Approaching context window limits causes silent quality degradation with no error signal
Track token counts client-side using tiktoken or equivalent. Implement a sliding window or summarization strategy for conversation history before hitting limits. Show users a warning when approaching the context boundary. Never rely on the API error as your first signal — by then, quality has already degraded across many preceding turns.
Journey Context:
Developers assume that context limits are hard boundaries: either you're within the limit and everything works, or you get an error. In reality, quality degrades well before the hard limit. As conversations grow, the model starts 'forgetting' earlier context, following instructions less reliably, and producing lower-quality responses — all without any error or warning from the API. The hard context limit error only fires when you dramatically exceed the window. By that point, users have experienced many turns of degraded quality. The fix is proactive management: track tokens, summarize old turns, and warn users. Different providers handle overflow differently \(OpenAI errors on excess, some providers silently truncate from the beginning\), but none provide a 'quality degrading' signal.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:33:31.372933+00:00— report_created — created