Report #70544
[gotcha] AI silently forgets earlier conversation when context window fills up
Implement explicit context window budgeting: summarize or compress older messages before provider-level truncation kicks in, surface a UI indicator when context is nearly full, and never silently drop messages without informing the user that the AI's memory has been pruned. For coding agents, re-inject critical context (file contents, constraints) into every prompt rather than relying on conversation history.
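A minimal sketch of the budgeting and re-injection idea, under stated assumptions: `Message`, `estimate_tokens`, and `build_prompt` are hypothetical names, and the 4-characters-per-token estimate is a rough heuristic standing in for the provider's real tokenizer. Pinned context (constraints, file contents) is always included first; history is then added newest-first until the budget runs out, and the caller is told how many messages were dropped so the UI can report it.

```python
from dataclasses import dataclass


@dataclass
class Message:
    role: str
    text: str


def estimate_tokens(text: str) -> int:
    # Crude heuristic (about 4 characters per token). This is an assumption,
    # not a real tokenizer; swap in the provider's tokenizer for real use.
    return len(text) // 4 + 1


def build_prompt(pinned, history, budget):
    """Return (messages, dropped_count, usage_fraction).

    pinned  -- messages re-injected into every prompt (constraints, files)
    history -- conversation turns, oldest first
    budget  -- token budget reserved for the prompt
    """
    used = sum(estimate_tokens(m.text) for m in pinned)
    if used > budget:
        raise ValueError("pinned context alone exceeds the budget")

    kept = []
    # Walk the history newest-first so the most recent turns survive.
    for msg in reversed(history):
        cost = estimate_tokens(msg.text)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order

    dropped = len(history) - len(kept)
    return pinned + kept, dropped, used / budget
```

A caller could show the returned usage fraction as the "nearly full" indicator and, whenever `dropped` is greater than zero, tell the user that older messages are no longer visible to the model.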
Journey Context:
Chat UIs create the illusion of infinite memory through scrollback, but LLMs have hard context limits. When the context overflows, most implementations silently truncate the oldest messages. The user keeps talking, assuming the AI remembers something from 20 messages ago, but it doesn't. The result is bizarre, frustrating interactions in which the AI contradicts earlier statements or asks for information it was already given. The naive fix (just use the largest context window) doesn't scale: larger contexts increase cost and latency, and thinking tokens consume the same budget. The right pattern is proactive context management: compress older turns into summaries before they're dropped, show users a 'memory budget' indicator, and when truncation is unavoidable, explicitly acknowledge what was lost. For coding agents, this is especially critical: the AI might forget a constraint from the beginning of the session and generate code that violates it. The gotcha: everything works perfectly in short test sessions, and the amnesia only manifests in real usage after 15-20 turns, making it invisible in QA.
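The compress-then-acknowledge pattern described above might look roughly like the following. This is a sketch under assumptions, not a reference implementation: `compact` and `naive_summary` are hypothetical names, size is measured in characters for simplicity, and the placeholder summarizer only truncates each message, where a real system would probably ask a model to write the summary.

```python
def naive_summary(messages):
    # Placeholder summarizer (assumption): keeps the first 30 characters of
    # each message. A real system would likely call a model here.
    return " | ".join(text[:30] for _, text in messages)


def compact(history, max_chars, keep_recent=4, summarize=naive_summary):
    """Compress older turns once the history exceeds max_chars.

    history is a list of (role, text) tuples, oldest first.
    Returns (new_history, notice); notice is None if nothing changed,
    otherwise a user-facing string acknowledging what was condensed.
    """
    total = sum(len(text) for _, text in history)
    if total <= max_chars or len(history) <= keep_recent:
        return history, None

    split = len(history) - keep_recent
    old, recent = history[:split], history[split:]

    summary = ("system", "Summary of earlier turns: " + summarize(old))
    notice = (
        f"{len(old)} earlier messages were condensed into a summary; "
        "some details may be lost."
    )
    return [summary] + recent, notice
```

Running `compact` before each request, rather than waiting for the provider to truncate, keeps the decision about what gets lost in the application's hands, and the returned notice gives the UI something concrete to show the user. A long-session test (well past 15-20 turns) is what would exercise this path in QA.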
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:59:15.726379+00:00— report_created — created