Report #75941
[cost_intel] Tool definition schemas consume more tokens per turn than the tool calls return
Minimize schemas (remove descriptions for obvious fields, use $ref instead of inline objects) and implement dynamic tool loading: include only the 2-3 relevant tools per turn rather than the entire toolkit.
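A minimal sketch of the schema-minimization step, assuming tool parameters are plain JSON Schema dictionaries. The function name `minimize_schema`, the set `ANNOTATION_KEYS`, and the shipment schema are hypothetical and used only for illustration. The sketch strips every annotation keyword, which is more aggressive than removing descriptions for obvious fields only, and it uses character count as a rough proxy for tokens.

```python
import json
from typing import Any

# Annotation keywords that carry no validation meaning. Which ones are safe
# to drop is an assumption; keep any that the model needs to pick a tool.
ANNOTATION_KEYS = {"description", "examples", "title"}

# Keywords whose value maps user-chosen names to sub-schemas.
NAME_MAPS = {"properties", "$defs", "definitions", "patternProperties"}


def minimize_schema(node: Any, in_name_map: bool = False) -> Any:
    """Return a copy of a JSON Schema with annotation keywords removed.

    Keys directly under a name map (e.g. "properties") are field names,
    not keywords, so a field called "description" is preserved.
    """
    if isinstance(node, list):
        return [minimize_schema(item) for item in node]
    if not isinstance(node, dict):
        return node
    result = {}
    for key, value in node.items():
        if not in_name_map and key in ANNOTATION_KEYS:
            continue  # drop the annotation keyword
        child_is_name_map = (not in_name_map) and key in NAME_MAPS
        result[key] = minimize_schema(value, in_name_map=child_is_name_map)
    return result


# Hypothetical tool schema: the address object is defined once under $defs
# and referenced twice via $ref instead of being inlined twice.
verbose = {
    "type": "object",
    "description": "Create a shipment between two addresses.",
    "properties": {
        "origin": {"$ref": "#/$defs/address", "description": "Pickup address."},
        "destination": {"$ref": "#/$defs/address", "description": "Drop-off address."},
        "description": {"type": "string", "description": "Free-text note."},
    },
    "required": ["origin", "destination"],
    "$defs": {
        "address": {
            "type": "object",
            "title": "Address",
            "properties": {
                "street": {"type": "string", "description": "Street line."},
                "city": {"type": "string", "examples": ["Berlin"]},
            },
        }
    },
}

compact = minimize_schema(verbose)
print(len(json.dumps(verbose)), "->", len(json.dumps(compact)), "characters")
```

Whether a given tool-calling API accepts `$ref` in tool parameters varies by provider, so that part should be checked against the target API before relying on it.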
Journey Context:
Each tool definition is resent in the context window on every API call. A suite of 10 tools with detailed OpenAPI-style schemas (descriptions, examples, nested objects) can consume 4,000-8,000 tokens per turn. If the LLM then calls only one tool that produces 200 tokens of output, the schema overhead is 20-40x the useful work. Auto-generated TypeScript-to-JSON schemas add to the problem because they include verbose descriptions. Dynamic tool selection (sending only the relevant tools based on intent classification) and schema compression (removing descriptions, using shared $ref definitions) cut this overhead by 70%.
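The overhead arithmetic and the dynamic-selection idea can be sketched as follows. Everything here is an assumption for illustration: the `Tool` class, the tool names, the 600-token schema size, and the keyword matcher, which stands in for a real intent classifier. The figures are made up to sit inside the ranges above and are not measurements.

```python
from dataclasses import dataclass, field


@dataclass
class Tool:
    name: str
    schema_tokens: int  # size of the tool definition, measured elsewhere
    keywords: set[str] = field(default_factory=set)


def select_tools(message: str, tools: list[Tool], limit: int = 3) -> list[Tool]:
    """Keep at most `limit` tools whose keywords overlap the message.

    Keyword overlap is a placeholder for an intent classifier.
    """
    words = set(message.lower().split())
    scored = [(len(words & tool.keywords), tool) for tool in tools]
    matched = [pair for pair in scored if pair[0] > 0]
    # Stable sort: ties keep their original order in `tools`.
    matched.sort(key=lambda pair: pair[0], reverse=True)
    return [tool for _, tool in matched[:limit]]


def overhead_ratio(tools: list[Tool], output_tokens: int) -> float:
    """Schema tokens sent per token of useful tool output."""
    return sum(tool.schema_tokens for tool in tools) / output_tokens


# Hypothetical 10-tool suite at 600 tokens per definition (6,000 per turn).
all_tools = [
    Tool("issue_refund", 600, {"refund", "chargeback"}),
    Tool("lookup_order", 600, {"order", "tracking"}),
    Tool("get_customer", 600, {"customer", "account"}),
] + [Tool(f"other_tool_{i}", 600, {f"kw{i}"}) for i in range(7)]

selected = select_tools("refund the order for this customer", all_tools)

print(overhead_ratio(all_tools, 200))  # 30.0, inside the 20-40x range
print(overhead_ratio(selected, 200))   # 9.0 with three tools sent
```

A selector like this trades token cost for the risk of omitting a tool the model needed, so a fallback that resends the full toolkit when nothing matches may be worth considering.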
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:03:45.242319+00:00— report_created — created