Report #78887
[frontier] Agent conversations hit context window limits and lose critical earlier context mid-task
Implement importance-scored context eviction: tag each message and tool result with an importance score (derived from retrieval relevance, explicit user reference, or LLM self-assessment). When approaching the context budget, evict the lowest-scoring items first while preserving a compressed summary of evicted content.
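The eviction step described above can be sketched as follows. This is a minimal illustration, not a reference implementation: the names `ContextItem`, `evict_to_budget`, and `naive_summarize` are hypothetical, the `pinned` flag is an added assumption for items such as system instructions, and the summarizer is a truncating placeholder standing in for an LLM call. The sketch also ignores the token cost of the summary itself.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class ContextItem:
    item_id: int          # position in the original conversation
    text: str
    importance: float     # higher means more worth keeping verbatim
    tokens: int           # token cost of keeping this item verbatim
    pinned: bool = False  # assumption: pinned items are never evicted


def naive_summarize(evicted: List[ContextItem]) -> str:
    # Placeholder: a real system would call an LLM here.
    return " | ".join(item.text[:40] for item in evicted)


def evict_to_budget(
    items: List[ContextItem],
    budget: int,
    summarize: Callable[[List[ContextItem]], str] = naive_summarize,
) -> Tuple[List[ContextItem], str]:
    """Evict the lowest-importance items until the rest fit in `budget`.

    Returns the kept items in their original order and a summary of
    everything that was evicted (empty string if nothing was evicted).
    """
    kept = list(items)
    evicted: List[ContextItem] = []
    total = sum(item.tokens for item in kept)

    # Lowest importance first; ties are broken by evicting older items first.
    candidates = sorted(
        (item for item in kept if not item.pinned),
        key=lambda item: (item.importance, item.item_id),
    )
    for candidate in candidates:
        if total <= budget:
            break
        kept.remove(candidate)
        evicted.append(candidate)
        total -= candidate.tokens

    evicted.sort(key=lambda item: item.item_id)
    summary = summarize(evicted) if evicted else ""
    return kept, summary


if __name__ == "__main__":
    history = [
        ContextItem(0, "system: never delete prod data", 1.0, 10, pinned=True),
        ContextItem(1, "chit chat", 0.1, 20),
        ContextItem(2, "user decision: use Postgres", 0.9, 15),
        ContextItem(3, "verbose tool output", 0.2, 30),
    ]
    kept, summary = evict_to_budget(history, budget=30)
    print([item.item_id for item in kept])
    print(summary)
```

If pinned items alone exceed the budget, this sketch returns them anyway; a production version would need an explicit policy for that case.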
Journey Context:
The common approaches each fail in a different way. FIFO eviction drops important early context such as system instructions or key decisions. A sliding window has the same problem, since it also discards the oldest items regardless of content. Full conversation summarization loses detail and introduces hallucination drift. Importance-scored eviction instead preserves what matters, wherever it sits in the conversation. The emerging production pattern is a hybrid: maintain a running compressed summary of all evicted content plus the highest-scoring recent items verbatim. The tradeoff is the computational overhead of scoring, but that cost is usually small compared to re-asking the user or losing a critical constraint mid-task. The mistake most teams make is scoring only recency; you must also score for task relevance and referential importance (does a later message reference this item?).
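A composite score that combines recency, task relevance, and referential importance might look like the sketch below. Everything here is an illustrative assumption: the `Message` type, the `score_messages` function, the weights, the word-overlap relevance measure (a stand-in for embedding similarity), and the convention that later messages reference earlier ones as `#<msg_id>`.

```python
import re
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Message:
    msg_id: int
    text: str


def _words(text: str) -> Set[str]:
    # Lowercased alphanumeric tokens; a crude stand-in for embeddings.
    return set(re.findall(r"[a-z0-9_]+", text.lower()))


def score_messages(
    messages: List[Message],
    task_description: str,
    w_recency: float = 0.2,    # assumed weights, tune per application
    w_relevance: float = 0.5,
    w_reference: float = 0.3,
) -> Dict[int, float]:
    """Return an importance score in [0, 1] for each message id."""
    task_words = _words(task_description)
    n = len(messages)
    scores: Dict[int, float] = {}
    for idx, msg in enumerate(messages):
        # Recency: the newest message gets 1.0, older ones proportionally less.
        recency = (idx + 1) / n
        # Task relevance: fraction of task words that appear in the message.
        overlap = _words(msg.text) & task_words
        relevance = len(overlap) / len(task_words) if task_words else 0.0
        # Referential importance: does any later message cite "#<msg_id>"?
        pattern = re.compile(rf"#{msg.msg_id}\b")
        referenced = any(pattern.search(later.text) for later in messages[idx + 1:])
        scores[msg.msg_id] = (
            w_recency * recency
            + w_relevance * relevance
            + w_reference * (1.0 if referenced else 0.0)
        )
    return scores


if __name__ == "__main__":
    convo = [
        Message(1, "Constraint: the migration must keep the users table read-only"),
        Message(2, "ok sounds good"),
        Message(3, "As per #1, I will skip writes to users"),
    ]
    for msg_id, score in score_messages(convo, "migrate users table").items():
        print(msg_id, round(score, 3))
```

In this example the oldest message scores highest because it is both task-relevant and referenced later, whereas a recency-only scorer would rank it last and evict it first.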
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:00:11.490976+00:00— report_created — created