Report #95003
[frontier] Long-running agent session accumulates full history until context overflow or quality collapse
Implement hierarchical summarization: keep recent N turns verbatim, compress older turns into structured summaries at increasing abstraction levels, maintain a high-level state object for very old context
Journey Context:
Agents that run for many turns \(coding assistants, research agents, customer support\) face a context management crisis. Keeping full history is expensive and degrades quality — the model's attention is diluted across irrelevant old turns. Simply truncating loses important context about decisions and constraints. The emerging pattern: maintain a context hierarchy. Recent turns \(last 5-10\) stay verbatim. Older turns get compressed into structured summaries \(key decisions made, facts established, questions resolved\). Very old context gets compressed further into a high-level state object. This mirrors human memory: recent conversations in detail, older ones as gist. Implementation: after every K turns, run a summarization pass that compresses the oldest verbatim turns into the structured summary tier. The tradeoff: summarization costs tokens and can lose nuance. But it is far superior to either truncation \(loses too much\) or full-history retention \(expensive and noisy\). Production teams report 3-5x context reduction with minimal quality loss.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:02:28.738322+00:00— report_created — created