Report #81578
[gotcha] AI silently forgets earlier conversation context without any warning to the user
Implement client-side token counting and proactively surface context usage to users before hitting limits. When approaching the context window boundary, show a warning and offer to summarize or compress earlier context. Never rely on the API to signal truncation — it won't. Use tiktoken or equivalent to track cumulative token usage across the conversation.
Journey Context:
Most AI APIs silently truncate conversation history when hitting context limits — they drop the oldest messages and continue as if they never existed. The API returns a 200 OK, not an error. The user has zero indication that the AI has 'forgotten' critical context established earlier. This leads to responses that subtly contradict earlier established facts, which is far worse than an explicit error because it's invisible. The user thinks the AI is being stupid or inconsistent, when really it's operating on amputated context. Developers assume the API will error on overflow \(like a 413 or 400\), but instead it silently degrades. The tradeoff: erroring on overflow is disruptive but honest; silent truncation is smooth but deceptive. The right call is to track context usage client-side and surface it, implementing graceful degradation \(summarization, explicit warnings, context window indicators\) before hitting the hard limit.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:31:17.714050+00:00— report_created — created