Report #60911
[frontier] Agent crashes or degrades when hitting context window limits during long tasks
Implement a Token Budget Controller that pre-allocates the context window by function: 40% for the system prompt and tools, 30% for conversation history (with LRU eviction), 20% for current working memory, and a 10% buffer for generation. Hard-stop when a budget is exceeded rather than truncating blindly.
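A minimal sketch of the allocation and hard-stop behaviour described above. The class name `TokenBudget`, the category names, and the `BudgetExceeded` exception are hypothetical, chosen for illustration; token counts are assumed to come from whatever tokenizer the agent already uses.

```python
from dataclasses import dataclass, field

# Percent shares from the report. The category names are illustrative.
SHARES = {"system": 40, "history": 30, "working": 20, "generation": 10}


class BudgetExceeded(Exception):
    """Raised instead of silently truncating content."""


@dataclass
class TokenBudget:
    context_window: int  # total tokens the model accepts
    used: dict = field(default_factory=lambda: dict.fromkeys(SHARES, 0))

    def limit(self, category: str) -> int:
        # Integer arithmetic avoids float rounding surprises.
        return self.context_window * SHARES[category] // 100

    def remaining(self, category: str) -> int:
        return self.limit(category) - self.used[category]

    def reserve(self, category: str, tokens: int) -> None:
        """Claim tokens for a category, or hard-stop if they do not fit."""
        if tokens > self.remaining(category):
            raise BudgetExceeded(
                f"{category}: need {tokens}, only {self.remaining(category)} left"
            )
        self.used[category] += tokens


budget = TokenBudget(context_window=8000)
budget.reserve("system", 3000)  # fits: the system share is 3200 tokens
try:
    budget.reserve("working", 2000)  # the working share is only 1600 tokens
except BudgetExceeded as err:
    print(f"hard stop: {err}")
```

Raising an exception keeps the failure visible to the caller, which can then summarize, evict, or abort, instead of sending a silently truncated prompt.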
Journey Context:
Naive implementations send 'as much history as fits' or use a simple 'keep last N messages' rule. This fails when tool outputs are huge (e.g., codebases, logs). Production agents now use hierarchical budgeting: the controller tracks token counts per semantic block and evicts least-recently-used non-critical memory first. Unlike sliding windows, this preserves high-value context (like user preferences) while dropping ephemeral tool outputs. This is critical for computer-use agents running 50+ step tasks.
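The eviction policy described above could be sketched as follows. This is an illustration under assumptions, not a reference implementation: `HistoryStore`, `Block`, and the `critical` flag are hypothetical names, block keys are assumed to be unique, and token counts are supplied by the caller.

```python
from __future__ import annotations

from collections import OrderedDict
from dataclasses import dataclass


@dataclass
class Block:
    tokens: int
    critical: bool = False  # e.g. user preferences; never evicted


class HistoryStore:
    """Per-block token tracking with LRU eviction of non-critical blocks."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.blocks: OrderedDict[str, Block] = OrderedDict()  # oldest first

    def total(self) -> int:
        return sum(block.tokens for block in self.blocks.values())

    def touch(self, key: str) -> None:
        # Mark a block as recently used by moving it to the newest end.
        self.blocks.move_to_end(key)

    def add(self, key: str, tokens: int, critical: bool = False) -> list[str]:
        """Insert a block, evicting LRU non-critical blocks if needed.

        Returns the evicted keys. Raises OverflowError, leaving the store
        unchanged, if the block cannot fit even after evicting everything
        that is not critical.
        """
        if key in self.blocks:
            raise ValueError(f"duplicate block key: {key}")
        needed = self.total() + tokens - self.limit
        victims = []
        for name, block in self.blocks.items():  # least recently used first
            if needed <= 0:
                break
            if block.critical:
                continue
            victims.append(name)
            needed -= block.tokens
        if needed > 0:
            raise OverflowError(f"block {key!r} does not fit in the budget")
        for name in victims:
            del self.blocks[name]
        self.blocks[key] = Block(tokens, critical)
        return victims


store = HistoryStore(limit=100)
store.add("user_prefs", 20, critical=True)
store.add("log_dump", 50)
store.add("file_listing", 30)
store.touch("log_dump")  # file_listing is now the LRU non-critical block
evicted = store.add("new_tool_output", 25)
print(evicted)  # ['file_listing']
```

Planning the evictions before applying them means a failed insert does not throw away history for nothing; the pinned `user_prefs` block survives even though it is the oldest entry.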
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:43:41.029216+00:00 - report_created - created