Report #49329
[frontier] Agent performance degrading as context window fills with irrelevant history
Implement tiered context management with explicit eviction policies. Maintain hot context (current task, recent turns at full fidelity), warm context (compressed summaries of completed subtasks), and cold context (raw events in the event log, retrieved on demand). When context approaches its limit, compress the oldest hot entries into warm summaries and evict the originals from the hot tier.
Journey Context:
Context windows are limited, and the usual answer is RAG: retrieve relevant documents on demand. But RAG does not address the agent's OWN conversation history growing too large. The emerging pattern from production systems is tiered context with eviction: treat the context window like a CPU cache hierarchy (L1/L2/L3). Hot: current task + recent N turns. Warm: compressed summaries of completed subtasks (generated by a cheaper model). Cold: raw events in the event log, retrieved only when needed. When hot context approaches the limit, the oldest entries are compressed into warm summaries and evicted from the hot tier; the raw events remain in the cold event log, so they can still be retrieved later. The key insight: compression must be LOSSY but TASK-RELEVANT. Summarize what is no longer immediately needed while preserving task-critical details. Prompt caching (for example, Anthropic's) can make this cheaper when the stable part of the prompt, such as system instructions followed by warm summaries, is kept as an unchanged prefix, because a cached prefix does not have to be reprocessed on each turn. Two caveats: rewriting a warm summary invalidates the cache from that point onward, and cold context lives outside the prompt, so caching only applies to it once it has been retrieved into the window.
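
The hot/warm/cold flow described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: the names TieredContext, estimate_tokens, and truncate_summary are hypothetical, the token count is a crude word count, and the summarizer simply truncates text where a real system would call a cheaper model.

```python
from dataclasses import dataclass, field
from typing import Callable, List


def estimate_tokens(text: str) -> int:
    """Crude estimate (assumption: one token per whitespace-separated word)."""
    return len(text.split())


def truncate_summary(text: str, max_words: int = 8) -> str:
    """Stand-in summarizer that keeps the first few words.
    A real system would call a cheaper model here."""
    words = text.split()
    suffix = " ..." if len(words) > max_words else ""
    return " ".join(words[:max_words]) + suffix


@dataclass
class TieredContext:
    hot_budget: int  # max estimated tokens allowed in the hot tier
    summarize: Callable[[str], str] = truncate_summary
    hot: List[str] = field(default_factory=list)   # recent turns, full fidelity
    warm: List[str] = field(default_factory=list)  # lossy summaries of evicted turns
    cold: List[str] = field(default_factory=list)  # append-only raw event log

    def add_turn(self, text: str) -> None:
        self.cold.append(text)  # log every event raw, so eviction is recoverable
        self.hot.append(text)
        self._evict_if_needed()

    def _hot_tokens(self) -> int:
        return sum(estimate_tokens(t) for t in self.hot)

    def _evict_if_needed(self) -> None:
        # Compress the oldest hot entries until under budget.
        # The newest turn is always kept at full fidelity.
        while self._hot_tokens() > self.hot_budget and len(self.hot) > 1:
            oldest = self.hot.pop(0)
            self.warm.append(self.summarize(oldest))

    def retrieve_cold(self, keyword: str) -> List[str]:
        """On-demand lookup in the raw log (naive case-insensitive substring match)."""
        return [e for e in self.cold if keyword.lower() in e.lower()]

    def build_prompt(self) -> str:
        """Warm summaries first (they change less often), then hot turns."""
        return "\n".join(["[summaries]", *self.warm, "[recent]", *self.hot])


ctx = TieredContext(hot_budget=12)
ctx.add_turn("user: please refactor the billing module to use the new tax API")
ctx.add_turn("agent: done, tests pass")
ctx.add_turn("user: now update the changelog")
print(ctx.build_prompt())
```

Placing warm summaries ahead of hot turns in build_prompt is a design choice made here so that the slower-changing content forms the prompt prefix, which is the layout prefix-based prompt caching benefits from. The budget, the eviction order (oldest first), and the substring retrieval are all placeholders to be tuned or replaced.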
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:17:10.296545+00:00— report_created — created