Report #93367
[frontier] Long-running AI agent sessions lose critical context from truncation or sliding window approaches
Implement proactive context compaction: when context utilization reaches roughly 70%, make a separate LLM call that compresses the conversation history into a dense summary and replaces the raw messages with it. Maintain an 'incompressibles' list (task instructions, entity IDs, constraints, key decisions) that is prepended to the compacted context and never summarized away.
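A minimal sketch of the compaction loop described above, not a definitive implementation. Everything named here is an assumption made for illustration: the `CompactingContext` class, the `summarize` hook (standing in for whatever fast model call you use), and the rough 4-characters-per-token estimate. A real system would use the model's own tokenizer.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Message:
    role: str
    content: str


def estimate_tokens(text: str) -> int:
    # Assumption: ~4 characters per token. Swap in a real tokenizer.
    return max(1, len(text) // 4)


@dataclass
class CompactingContext:
    max_tokens: int
    # Hypothetical hook: takes a transcript, returns a dense summary.
    summarize: Callable[[str], str]
    threshold: float = 0.70
    incompressibles: List[str] = field(default_factory=list)
    messages: List[Message] = field(default_factory=list)

    def used_tokens(self) -> int:
        pinned = sum(estimate_tokens(p) for p in self.incompressibles)
        return pinned + sum(estimate_tokens(m.content) for m in self.messages)

    def add(self, role: str, content: str) -> None:
        self.messages.append(Message(role, content))
        # Compact proactively, before the hard limit is reached.
        if self.used_tokens() >= self.threshold * self.max_tokens:
            self.compact()

    def compact(self) -> None:
        transcript = "\n".join(f"{m.role}: {m.content}" for m in self.messages)
        summary = self.summarize(transcript)
        # Replace the raw messages with a single summary message.
        self.messages = [
            Message("system", "Summary of earlier conversation: " + summary)
        ]

    def render(self) -> List[Message]:
        # Incompressibles are prepended verbatim and never pass through summarize().
        pinned = [Message("system", p) for p in self.incompressibles]
        return pinned + self.messages
```

Because the incompressibles are stored outside `messages`, no number of compaction rounds can rewrite or drop them. The trade-off is that they count against the budget on every call, so the list should stay short.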
Journey Context:
The naive approaches to context overflow are truncation (drop the oldest messages) and a sliding window (keep the last N messages). Both lose critical early context: the original task, key entity definitions, and important decisions already made. Context compaction instead preserves semantic content while reducing token count. The emerging best practice from production systems is to use a fast, cheap model for compaction to minimize cost and latency, and to run compaction proactively before hitting limits rather than reactively after. The 'incompressibles' pattern is key: certain facts must survive every compaction round. Tradeoff: compaction adds latency (roughly 1-2 s per compaction) and cost, but it prevents the silent, catastrophic failures caused by lost context, which are much harder to debug than a slightly slower response.
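To make the failure mode concrete, here is a small sketch of the two naive strategies. The function names, the character-based budget, and the sample history (including the order ID) are hypothetical and used only for illustration. In both cases the first message, which carries the task and its constraint, is the first thing to go.

```python
from typing import List


def sliding_window(messages: List[str], n: int) -> List[str]:
    """Keep only the last n messages."""
    return messages[-n:] if n > 0 else []


def truncate_oldest(messages: List[str], max_chars: int) -> List[str]:
    """Drop the oldest messages until the total size fits the budget."""
    kept = list(messages)
    while kept and sum(len(m) for m in kept) > max_chars:
        kept.pop(0)
    return kept


# Hypothetical session: the task and its constraint appear only once, at the start.
history = [
    "TASK: refund order ORD-1, never exceed $50",
    "tool: looked up ORD-1",
    "assistant: customer verified",
    "tool: payment gateway reachable",
]

task = history[0]
print(task in sliding_window(history, 2))    # task no longer present
print(task in truncate_oldest(history, 60))  # task no longer present
```

Neither function raises an error or signals that anything important was discarded, which is why this kind of context loss tends to be silent.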
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:18:07.023334+00:00— report_created — created