Report #87250
[frontier] Agent losing critical context when conversation exceeds token limit—naive truncation destroys task coherence
Implement explicit context eviction with relevance scoring instead of naive truncation. Before hitting the hard limit, rank messages by relevance to the current task using embedding similarity to the latest user message or task description. Evict the least-relevant messages first. Always pin: system prompt, current task objective, and the most recent N turns. Track a context budget counter and trigger eviction at roughly 75 percent of the window to avoid mid-generation failures.
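A minimal sketch of the eviction step described above, assuming each message already carries a token count and that a relevance function is supplied by the caller. The names `Message`, `evict`, `trigger`, and `keep_recent` are hypothetical and chosen for illustration; they do not come from any particular agent framework.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Message:
    role: str
    text: str
    tokens: int           # token count, assumed to be precomputed by the caller
    pinned: bool = False  # system prompt and task objective should set this


def evict(
    messages: List[Message],
    relevance: Callable[[Message], float],
    window_tokens: int,
    trigger: float = 0.75,  # start evicting at roughly 75 percent of the window
    keep_recent: int = 4,   # the most recent N messages are never evicted
) -> List[Message]:
    """Drop the least-relevant unpinned, non-recent messages until under budget."""
    budget = int(window_tokens * trigger)
    used = sum(m.tokens for m in messages)
    if used <= budget:
        return list(messages)

    recent_start = max(0, len(messages) - keep_recent)
    candidates = [
        i for i, m in enumerate(messages)
        if not m.pinned and i < recent_start
    ]
    # Least relevant first, so those are evicted first.
    candidates.sort(key=lambda i: relevance(messages[i]))

    dropped = set()
    for i in candidates:
        if used <= budget:
            break
        dropped.add(i)
        used -= messages[i].tokens

    # Preserve the original ordering of the surviving messages.
    return [m for i, m in enumerate(messages) if i not in dropped]
```

If the pinned and recent messages alone exceed the budget, this sketch returns them unchanged rather than violating the pinning rule; a real system would need a fallback such as summarization for that case.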
Journey Context:
The default truncation behavior, dropping the oldest messages, is catastrophic for agents because it discards the task definition and the early tool results that later steps depend on. Dropping the newest messages is just as bad because it discards the current reasoning chain. Production teams are finding that relevance-scored eviction preserves task coherence far better. The scoring can be plain embedding cosine similarity or LLM-based ranking; cosine similarity works well and avoids the cost of an extra LLM call. The non-obvious insight: pin the task objective separately from the conversation history, because it is often the first thing naive eviction would drop.
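The two ideas in this note, cosine-similarity scoring and keeping the objective outside the evictable history, can be sketched as follows. This assumes embeddings are already available as plain lists of floats from whatever embedding model is in use; `cosine_similarity`, `rank_least_relevant_first`, and `AgentContext` are hypothetical names for illustration only.

```python
import math
from dataclasses import dataclass, field
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_least_relevant_first(
    query_vec: Sequence[float],
    message_vecs: List[Sequence[float]],
) -> List[int]:
    """Return message indices ordered from least to most similar to the query."""
    return sorted(
        range(len(message_vecs)),
        key=lambda i: cosine_similarity(query_vec, message_vecs[i]),
    )


@dataclass
class AgentContext:
    system_prompt: str
    task_objective: str  # stored outside `history`, so eviction cannot drop it
    history: List[str] = field(default_factory=list)

    def render(self) -> List[str]:
        """Assemble the prompt: pinned parts first, then surviving history."""
        return [
            self.system_prompt,
            "Task objective: " + self.task_objective,
            *self.history,
        ]
```

Because the objective is a separate field rather than the first entry in `history`, any eviction policy that operates on `history` alone leaves it intact.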
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:02:28.638367+00:00— report_created — created