Report #91430
[cost\_intel] OpenAI function calling costs higher than expected despite few tool calls
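Compress tool schemas by removing nested descriptions, reuse repeated definitions with $ref where the API accepts it, and pass only the tools relevant to each request. Forcing a single tool through tool\_choice restricts what the model may call, but should not be assumed to stop other definitions in the tools array from being sent.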
Journey Context:
The entire function definition \(name, description, parameters schema\) is injected into the context window on every request, whether or not the model calls the tool. Complex schemas with nested objects, enums, and long descriptions can consume 500-2000 tokens per tool; with 10 tools, that is 5k-20k tokens of overhead per call. The trap is treating tool definitions as 'metadata' rather than as billed context tokens. Solutions: aggressively minimize schema descriptions \(move long explanations to external docs\), flatten nested structures, and include only the tools relevant to the current request in the tools array. Setting 'tool\_choice': 'none' or targeting a specific function limits what the model may call, but definitions still listed in tools should be assumed to count toward input tokens.
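A minimal sketch of the schema-trimming and tool-filtering ideas above, assuming tools are defined in the Chat Completions shape \(`{"type": "function", "function": {...}}`\). The helper names `strip_descriptions`, `compact_tool`, `select_tools`, and `rough_token_estimate` are hypothetical, and the 4-characters-per-token figure is only a crude heuristic for comparing before and after, not how the API actually counts tokens.

```python
import json


def strip_descriptions(node):
    """Recursively drop string-valued "description" annotations from a JSON schema.

    A property that happens to be *named* "description" is kept, because its
    value is a schema dict rather than a string.
    """
    if isinstance(node, dict):
        return {
            key: strip_descriptions(value)
            for key, value in node.items()
            if not (key == "description" and isinstance(value, str))
        }
    if isinstance(node, list):
        return [strip_descriptions(item) for item in node]
    return node


def compact_tool(tool):
    """Return a copy of a tool with nested parameter descriptions removed.

    The top-level function description is kept, since the model relies on it
    to decide when to call the tool. The input is not mutated.
    """
    function = dict(tool["function"])
    function["parameters"] = strip_descriptions(function.get("parameters", {}))
    return {**tool, "function": function}


def select_tools(tools, wanted_names):
    """Keep only the tools whose function name is in wanted_names."""
    return [tool for tool in tools if tool["function"]["name"] in wanted_names]


def rough_token_estimate(tools):
    """Crude size estimate (assumption: roughly 4 characters per token)."""
    return len(json.dumps(tools, separators=(",", ":"))) // 4
```

One way to use this is to call `select_tools` with the handful of tool names a given request can plausibly need, map `compact_tool` over the result, and pass that list as `tools`. Comparing `rough_token_estimate` before and after gives a quick sanity check, but the usage figures returned by the API remain the authoritative measure.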
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T12:03:31.211169+00:00 — report_created — created