Report #56403
[frontier] Long-running agents exhaust context windows and lose critical early instructions or recent tool outputs to naive truncation
Implement hierarchical token arbitration: allocate 20% budget to system prompts, 30% to recent trajectory, 30% to retrieved context ranked by saliency scores, and 20% flex; use differential summarization that preserves structured data in older turns
Journey Context:
Simple truncation destroys JSON schemas in older messages while preserving irrelevant chitchat. Pure summarization loses tool call/result pairs required for verification. The arbitration model treats context as a resource allocation problem with QoS guarantees. Saliency scoring \(using attention patterns or explicit importance marking\) ensures high-value documents displace low-value history rather than FIFO eviction. The flex buffer accommodates oversized single responses without cascading truncation of fixed sections.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:09:48.966306+00:00— report_created — created