Report #25186
[cost\_intel] Complex tool schemas consume thousands of tokens per request regardless of whether tools are invoked
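Trim tool descriptions to under 1,000 characters, flatten nested JSON schemas to a single level where possible, and remove unused enum values. Dynamically include only the relevant tools in each request. Note that tool\_choice only constrains which tool the model may call; reducing schema tokens requires filtering the tools list itself.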
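A minimal sketch of the trimming and flattening steps, assuming tool definitions are plain dicts with the schema stored under an `input_schema` key (for OpenAI-style definitions the key would differ, so it is a parameter). The helper names `trim_descriptions`, `flatten_properties`, and `minimize_tool` are hypothetical and not part of any SDK.

```python
MAX_DESCRIPTION_CHARS = 1000  # limit taken from the recommendation in this report


def trim_descriptions(node, limit=MAX_DESCRIPTION_CHARS):
    """Return a copy of a schema-like structure with string descriptions truncated."""
    if isinstance(node, dict):
        return {
            key: (value[:limit]
                  if key == "description" and isinstance(value, str)
                  else trim_descriptions(value, limit))
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [trim_descriptions(item, limit) for item in node]
    return node


def flatten_properties(schema, prefix="", sep="_"):
    """Flatten nested object properties into one level, joining names with sep.

    Returns (properties, required). A nested field stays required only if
    every enclosing object was required too. Name collisions are not handled.
    """
    flat, required = {}, []
    required_here = set(schema.get("required", []))
    for name, sub in schema.get("properties", {}).items():
        full = f"{prefix}{sep}{name}" if prefix else name
        if sub.get("type") == "object" and "properties" in sub:
            child_props, child_required = flatten_properties(sub, full, sep)
            flat.update(child_props)
            if name in required_here:
                required.extend(child_required)
        else:
            flat[full] = sub
            if name in required_here:
                required.append(full)
    return flat, required


def minimize_tool(tool, schema_key="input_schema"):
    """Trim descriptions and flatten the schema without mutating the input."""
    slim = trim_descriptions(tool)
    props, required = flatten_properties(slim.get(schema_key, {}))
    slim[schema_key] = {"type": "object", "properties": props, "required": required}
    return slim
```

With flattened names such as `customer_id`, the calling code has to map the model's arguments back to the nested shape the real function expects, so this trade-off is worth checking per tool.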
Journey Context:
Anthropic's Tool Use and OpenAI's Function Calling include the full JSON schema of every defined tool in the context window on every request. A complex tool with nested objects and detailed descriptions can consume 2,000-5,000 tokens. With 10 such tools defined, that is 20,000-50,000 tokens per request even if no tool is called. In multi-turn conversations this overhead is resent on every turn. The fix is aggressive schema minimization combined with dynamic tool selection \(sending only the tools relevant to the current conversation state\).
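The overhead and the effect of dynamic selection can be sketched as follows. This is an illustration under stated assumptions: the 4-characters-per-token ratio is a rough heuristic rather than a real tokenizer, and `ToolEntry`, its `tags` field, and `select_tools` are hypothetical names for a local registry kept outside the definitions sent to the API.

```python
import json
from dataclasses import dataclass

CHARS_PER_TOKEN = 4  # assumption: rough heuristic, real tokenizers vary


@dataclass
class ToolEntry:
    tags: frozenset   # hypothetical routing labels, never sent to the API
    definition: dict  # the tool definition that would be sent


def estimate_tokens(definition):
    """Approximate the token cost of one tool definition from its JSON size."""
    return len(json.dumps(definition, separators=(",", ":"))) // CHARS_PER_TOKEN


def select_tools(registry, active_tags):
    """Return only the definitions whose tags overlap the active tags."""
    wanted = frozenset(active_tags)
    return [entry.definition for entry in registry if entry.tags & wanted]


def conversation_overhead(definitions, turns):
    """Schema tokens paid over a conversation: the per-request cost times turns."""
    return sum(estimate_tokens(d) for d in definitions) * turns
```

The list returned by `select_tools` is what would be passed as the request's `tools` parameter. How `active_tags` is derived from conversation state (keyword rules, a classifier, or a cheap routing call) is left open here.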
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:40:46.186270+00:00 — report_created — created