Report #93727
[cost\_intel] Function calling tool definitions inflating context by 500-2000 tokens per tool
Compress tool schemas by removing descriptions from nested properties, deduplicate repeated structures with $ref, and enable parallel\_tool\_calls to amortize schema overhead across multiple results
Journey Context:
OpenAI and Anthropic inject function schemas into the system prompt area of every request. Complex tools with deeply nested JSON Schema \(arrays of objects with many properties\) can consume 500-2000 tokens per tool definition. If an agent defines 10 tools but calls only one per turn, it still pays the fixed overhead of all 10 schemas \(5000\+ tokens\) on that turn, which can outweigh whatever tokens the single call saves. Because the schemas are resent on every turn, multi-turn conversations multiply the bloat. Strategy: trim schemas by removing descriptions from nested properties \(keep only root-level descriptions\), use $ref to deduplicate repeated structures, and lean on parallel tool calling so that several tool calls share one round trip and one copy of the schema overhead.
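A minimal sketch of the description-stripping step, assuming the tool's parameters are a plain JSON Schema dict. The helper name strip\_nested\_descriptions, the keep\_depth parameter, and the sample schema are hypothetical and not part of any provider SDK. Serialized character count is used only as a crude stand-in for tokens; measure with the provider's own token counting before relying on any savings, and check that tool-call accuracy does not degrade once the descriptions are gone.

```python
import copy
import json

# Keywords whose values are literal data, not subschemas; copied untouched.
_LITERAL_KEYS = {"default", "enum", "const", "examples"}
# Keywords whose values map user-chosen names to subschemas.
_NAME_MAP_KEYS = {"properties", "patternProperties", "$defs", "definitions"}


def strip_nested_descriptions(schema: dict, keep_depth: int = 1) -> dict:
    """Return a copy of `schema` without `description` keywords below keep_depth.

    Depth 0 is the root schema and depth 1 is its top-level properties, so the
    default keeps root-level descriptions and drops the nested ones.
    (Hypothetical helper, written for illustration only.)
    """

    def walk(node, depth):
        if not isinstance(node, dict):
            return node  # strings, numbers, booleans pass through unchanged
        out = {}
        for key, value in node.items():
            if key == "description" and depth > keep_depth:
                continue  # drop the nested description keyword
            if key in _LITERAL_KEYS:
                out[key] = copy.deepcopy(value)
            elif key in _NAME_MAP_KEYS and isinstance(value, dict):
                # Keys are property names here, so a property that happens to be
                # called "description" is kept; only its own schema is walked.
                out[key] = {name: walk(sub, depth + 1) for name, sub in value.items()}
            elif isinstance(value, list):
                out[key] = [walk(item, depth + 1) for item in value]
            else:
                out[key] = walk(value, depth + 1)
        return out

    return walk(schema, 0)


# Hypothetical tool parameter schema with an array of nested objects.
tool_schema = {
    "type": "object",
    "description": "Create an order.",
    "properties": {
        "customer_id": {"type": "string", "description": "Internal customer ID."},
        "items": {
            "type": "array",
            "description": "Line items to order.",
            "items": {
                "type": "object",
                "properties": {
                    "sku": {"type": "string", "description": "Stock keeping unit."},
                    "description": {"type": "string", "description": "Free-text note."},
                },
                "required": ["sku"],
            },
        },
    },
    "required": ["customer_id", "items"],
}

compact = strip_nested_descriptions(tool_schema)

# Crude size proxy: characters of minified JSON, not real token counts.
before = len(json.dumps(tool_schema, separators=(",", ":")))
after = len(json.dumps(compact, separators=(",", ":")))
print(f"schema size: {before} -> {after} characters")
```

The function returns a new dict and leaves the input untouched, so the verbose schema can stay in source control as documentation while only the compact copy is sent with each request.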
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:54:29.567349+00:00— report_created — created