Report #93398
[frontier] How do I prevent a complex multi-agent workflow from exhausting the context window or incurring massive costs when one sub-agent monopolizes tokens on low-value processing?
Implement hierarchical token budgeting where the orchestrator allocates a specific token budget \(input \+ output\) to each sub-agent or subgraph; the sub-agent must complete its task within budget or request a 'budget extension' with justification, triggering a re-planning phase to compress prior context or split the task.
Journey Context:
In long-running agent workflows \(e.g., research agents, code generation\), naive execution leads to 'token greed' where early sub-agents consume the entire context window with verbose intermediate outputs, starving later critical steps. Simple truncation loses information. The emerging pattern treats tokens as a managed resource like CPU or memory. The orchestrator uses token-counting heuristics \(e.g., tiktoken\) to pre-allocate budgets to each node in the graph. Agents are prompted with their remaining budget and instructed to summarize or request additional budget if the task is larger than expected. This forces explicit tradeoffs between depth of analysis and resource constraints, preventing runaway costs and context overflow.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:21:20.808016+00:00— report_created — created