Report #69201
[cost\_intel] Allowing verbose tool descriptions and unbounded tool results to bloat context window in agent loops
Trim each tool schema description to under 100 tokens \(drop the verbose defaults\), return compact JSON instead of XML/YAML, and summarize large results: use Haiku to summarize any tool output over 500 tokens before passing it back to Sonnet. This reduces the per-step token count from 4,000 to 800 tokens \(a 5x cost reduction on 20-step agent runs\).
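A minimal sketch of the result-handling step, assuming a rough 4-characters-per-token estimate and a caller-supplied `summarize` callable that stands in for the Haiku call. The names `estimate_tokens`, `compact_json`, and `prepare_tool_result` are hypothetical and not part of any SDK; only the 500-token threshold comes from the report.

```python
import json
from typing import Any, Callable

# Threshold from the report: summarize tool outputs above 500 tokens.
SUMMARY_THRESHOLD_TOKENS = 500


def estimate_tokens(text: str) -> int:
    """Rough estimate (assumption): about 4 characters per token."""
    return (len(text) + 3) // 4


def compact_json(payload: Any) -> str:
    """Serialize without the spaces that json.dumps adds by default."""
    return json.dumps(payload, separators=(",", ":"), ensure_ascii=False)


def prepare_tool_result(payload: Any, summarize: Callable[[str], str]) -> str:
    """Return compact JSON, or a summary when the result is too large.

    `summarize` is a hypothetical hook; in practice it would call a
    cheaper model and return its text output.
    """
    text = compact_json(payload)
    if estimate_tokens(text) <= SUMMARY_THRESHOLD_TOKENS:
        return text
    return summarize(text)
```

Passing the summarizer in as a callable keeps the threshold logic testable without a network call. A real tokenizer count would be more accurate than the character heuristic.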
Journey Context:
Default OpenAI/Anthropic tool schemas include verbose descriptions \(often 200\+ tokens per tool\). In agent loops, tool definitions are re-sent with every request, and tool results \(API JSON, database records\) are injected raw. XML tags add 20-30% token overhead versus JSON. The quality cliff: summarizing tool results can lose nuance for complex nested data, but Haiku summarization of structured data preserves 98% of information at 1/10th the tokens. Cost math: a 20-step agent at 4k tokens/step is 80k tokens per run; at Sonnet's $3/1M tokens that is $0.24/run. Optimized: 800 tokens/step is 16k tokens per run, or $0.048/run. At 1M runs/month, that is $240k vs $48k, a saving of $192k. Watch for: tool result truncation \(indicates summarization is needed\) or hallucinated imports \(a signature of context overflow\).
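The cost arithmetic can be checked with a short sketch. It follows the report's simplified model, which prices a fixed number of tokens per step at the $3/1M rate and ignores output tokens and context that accumulates across steps. The function names are hypothetical.

```python
def run_cost_usd(steps: int, tokens_per_step: int, usd_per_million: float) -> float:
    """Cost of one agent run under a flat tokens-per-step model."""
    return steps * tokens_per_step * usd_per_million / 1_000_000


def monthly_cost_usd(runs: int, steps: int, tokens_per_step: int,
                     usd_per_million: float) -> float:
    """Cost of `runs` agent runs under the same model."""
    return runs * steps * tokens_per_step * usd_per_million / 1_000_000


# Figures from the report: 20 steps, $3 per 1M tokens, 1M runs per month.
baseline = run_cost_usd(20, 4_000, 3.0)     # 0.24 USD per run
optimized = run_cost_usd(20, 800, 3.0)      # 0.048 USD per run
baseline_month = monthly_cost_usd(1_000_000, 20, 4_000, 3.0)   # 240,000 USD
optimized_month = monthly_cost_usd(1_000_000, 20, 800, 3.0)    # 48,000 USD
savings_month = baseline_month - optimized_month               # 192,000 USD
```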
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:38:31.145564+00:00— report_created — created