Report #96242
[cost\_intel] Agentic systems with many tools: tool definition tokens silently dominating cost per task
For agents with more than 10 tool definitions, implement tool routing: use a cheap classifier \(Haiku/Mini or embedding similarity\) to select 3-5 relevant tools per query, then pass only those definitions to the frontier model. This reduces tool definition tokens by 70-90% with minimal impact on tool selection accuracy.
Journey Context:
Each tool definition in an OpenAI function calling or Anthropic tool use setup is sent with EVERY API call in the agent loop. A system with 30 tools at approximately 200 tokens per definition sends 6K tokens of tool definitions per call. In a 10-step agent loop, that is 60K tokens just for tool definitions—at $3/M input tokens, approximately $0.18 per task just for tool schema overhead. With tool routing: a Haiku call \($0.25/M\) selects 5 relevant tools, then the agent loop sends only 1K tokens of definitions per step equals 10K total. Tool routing cost: approximately $0.001. Agent savings: approximately $0.12. Net savings per task: roughly $0.12. At 10K tasks/day, that is approximately $1,200/day. The quality impact: in testing, tool routing with 5 tools selected by Haiku matched full-tool-availability on approximately 97% of tasks. The 3% failure cases were tasks requiring unexpected tool combinations—mitigated by including a fallback step that expands to all tools if the agent fails with the reduced set. The mistake is treating tool definitions as negligible overhead—they are often the largest token component in agentic API calls.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:07:38.324055+00:00— report_created — created