Report #46682
[cost_intel] Accumulating tool result tokens silently 10x'ing multi-turn agent costs
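Implement context truncation or summarization after every 3rd tool turn. In a 10-turn agent loop with 3k-token tool results, dropping old tool outputs from context can cut the tool-result portion of the bill by up to roughly 10x (on the order of $3.00 instead of $30.00 at high session volume; the absolute figures depend on model rate and traffic).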
Journey Context:
Developers budget agents based on single-turn costs but overlook that tool outputs are appended to the context history and re-billed as input on every subsequent turn. If a tool returns 3,000 tokens (e.g., database query results, file reads) and the agent runs 10 turns, you pay for those 3,000 tokens 10 times (30k tokens total), not once. At Haiku rates ($0.25/1M input tokens), that is $0.0075 of history bloat per session, or $7.50 per 1,000 sessions, and the total grows faster when every turn adds a new result of its own. Mitigation: aggressively summarize tool results to under 500 tokens, or use sliding-window context management that drops tool outputs older than 2 turns. This is arguably the #1 silent cost driver in agent architectures.
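The arithmetic and the sliding-window mitigation described above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation. It assumes every turn appends one tool result of the same size, which is a heavier case than the single carried result behind the 30k figure. It also assumes a generic message shape with a `"role": "tool"` entry rather than any specific provider's schema. The names `billed_tool_tokens`, `prune_tool_outputs`, and `usd` are hypothetical, and the rate constant is the one quoted in this report, so check current pricing before relying on it.

```python
from typing import Dict, List, Optional

# Rate quoted in this report (assumption: verify against current pricing).
USD_PER_MILLION_INPUT_TOKENS = 0.25


def usd(tokens: int) -> float:
    """Convert a billed input-token count to dollars at the assumed rate."""
    return tokens * USD_PER_MILLION_INPUT_TOKENS / 1_000_000


def billed_tool_tokens(turns: int, tokens_per_result: int,
                       keep_last: Optional[int] = None) -> int:
    """Total tool-result tokens billed as input over a loop in which each
    turn adds one tool result and all retained results are re-sent.

    keep_last=None models unbounded history; keep_last=N models a sliding
    window that retains only the N most recent tool results.
    """
    total = 0
    for turn in range(1, turns + 1):
        in_context = turn if keep_last is None else min(turn, keep_last)
        total += in_context * tokens_per_result
    return total


def prune_tool_outputs(messages: List[Dict[str, str]], keep_turns: int = 2,
                       placeholder: str = "[tool output dropped]"
                       ) -> List[Dict[str, str]]:
    """Return a copy of the history in which all but the most recent
    keep_turns tool messages have their content replaced by a placeholder.

    The messages are kept (not deleted) so that any tool-call/tool-result
    pairing a provider requires stays intact; that requirement is an
    assumption and varies by API.
    """
    tool_idx = [i for i, m in enumerate(messages) if m["role"] == "tool"]
    stale = set(tool_idx[:-keep_turns]) if keep_turns > 0 else set(tool_idx)
    return [
        {**m, "content": placeholder} if i in stale else dict(m)
        for i, m in enumerate(messages)
    ]


if __name__ == "__main__":
    full = billed_tool_tokens(10, 3000)                 # unbounded history
    windowed = billed_tool_tokens(10, 3000, keep_last=2)  # 2-turn window
    print(full, usd(full))
    print(windowed, usd(windowed))
```

Under these assumptions, the unbounded 10-turn loop bills 165,000 tool-result tokens and the 2-turn window bills 57,000. Replacing stale outputs with a short placeholder instead of a summary is the cheapest variant; a summarizer capped at under 500 tokens could be swapped in where older results still matter to the agent.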
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:49:56.165220+00:00— report_created — created