Report #77374
[cost\_intel] Adding tools to reduce LLM calls actually increases token cost by 3-5x due to tool schema bloat
Strip tool definitions down to required fields \(drop descriptions and examples where the meaning is obvious\), keep each call to fewer than 5 tools, use function\_calling with compressed schemas, or pre-filter tools via embedding similarity. In every case, measure tokens\_added against tokens\_saved to confirm the change actually pays off.
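A minimal sketch of the stripping and measuring steps, under stated assumptions: the tool layout \(`name`, `description`, `parameters`\) mirrors common function-calling formats but is not taken from any specific provider, `weather_tool` is a hypothetical example, and the 4-characters-per-token ratio is only a rough heuristic. Use your model's real tokenizer for billing-grade numbers.

```python
import json

CHARS_PER_TOKEN = 4  # assumption: rough average; swap in your model's tokenizer
DROP_KEYS = {"description", "examples"}  # annotation keys the report suggests removing


def estimate_tokens(text: str) -> int:
    """Approximate a token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def compress_schema(node, in_properties=False):
    """Recursively drop annotation keys from a JSON Schema fragment.

    Keys directly under "properties" are parameter names, so they are kept
    even when a parameter happens to be called "description".
    """
    if isinstance(node, dict):
        return {
            key: compress_schema(value, key == "properties" and not in_properties)
            for key, value in node.items()
            if in_properties or key not in DROP_KEYS
        }
    if isinstance(node, list):
        return [compress_schema(item) for item in node]
    return node


def compress_tool(tool: dict, keep_tool_description: bool = True) -> dict:
    """Compress the parameter schema; optionally keep the top-level description."""
    slim = {"name": tool["name"], "parameters": compress_schema(tool["parameters"])}
    if keep_tool_description and "description" in tool:
        slim["description"] = tool["description"]
    return slim


def schema_tokens(tools: list[dict]) -> int:
    """Estimated tokens replayed per turn for a list of tool definitions."""
    return sum(estimate_tokens(json.dumps(t, separators=(",", ":"))) for t in tools)


# Hypothetical tool definition, used only for illustration.
weather_tool = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {
            "city": {
                "type": "string",
                "description": "City name, for example Paris",
                "examples": ["Paris"],
            },
            "unit": {
                "type": "string",
                "enum": ["c", "f"],
                "description": "Temperature unit",
            },
        },
        "required": ["city"],
    },
}

before = schema_tokens([weather_tool])
after = schema_tokens([compress_tool(weather_tool)])
print(f"estimated tokens per turn: {before} -> {after} (saved {before - after})")
```

Multiplying the per-turn figure by the number of turns gives tokens\_added, which can then be compared with the tokens the tool actually saves. The top-level description is kept by default because it is often the only hint the model has about when to call the tool; pass `keep_tool_description=False` only when the name is self-explanatory.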
Journey Context:
Every tool definition is replayed into the context window on every turn. A complex JSON Schema with descriptions and examples can consume 500-1000 tokens per tool, so 10 tools add 5000-10000 tokens per turn, often just to save a single 100-token output. The 'intelligent agent' pattern of exposing many tools is therefore a token trap. Beyond cost, giving the model more than 5 tools tends to cause confusion and higher retry rates. Either compress the schemas \(remove 'description' where it is obvious, use short enums\) or use a router model to select the single relevant tool before the expensive call.
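The pre-filtering step can be sketched as follows. This is an illustration only: `embed` here is a toy bag-of-words vector standing in for a real embedding model or router model, and `TOOLS`, `MAX_TOOLS`, and `select_tools` are hypothetical names. The point is the shape of the pipeline: score every tool against the query cheaply, then send only the top few definitions to the expensive call.

```python
import math
import re
from collections import Counter

MAX_TOOLS = 4  # the report recommends fewer than 5 tools per call


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def select_tools(query: str, tools: list[dict], k: int = MAX_TOOLS) -> list[dict]:
    """Return up to k tools whose name and description best match the query."""
    query_vec = embed(query)
    scored = [
        (cosine(query_vec, embed(f"{tool['name']} {tool.get('description', '')}")), tool)
        for tool in tools
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Tools with zero similarity are dropped rather than padded in.
    return [tool for score, tool in scored[:k] if score > 0]


# Hypothetical tool catalogue, used only for illustration.
TOOLS = [
    {"name": "send_email", "description": "Send an email message to a recipient"},
    {"name": "get_weather", "description": "Look up the current weather forecast for a city"},
    {"name": "query_orders", "description": "Search customer orders in the database"},
]

chosen = select_tools("What is the weather in Paris?", TOOLS, k=1)
print([tool["name"] for tool in chosen])
```

Word overlap is noisy \(common words such as "the" contribute to the score\), so a production setup would likely swap `embed` for real embeddings or a small router model, and precompute the tool vectors once instead of on every query.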
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:28:20.303234+00:00— report_created — created