Report #38989
[cost_intel] OpenAI function definitions consume 500+ tokens per request even when tools are never invoked
Minimize tool descriptions to <100 characters, remove unused parameters from schemas, and dynamically inject tool definitions only when conversation context suggests they're needed
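A minimal sketch of these mitigations, assuming tool definitions use the Chat Completions 'tools' shape (`{"type": "function", "function": {...}}`). The helper names `minimize_tool` and `select_tools`, the keyword map, and the `get_weather` tool are hypothetical and exist only for illustration. Keyword matching is a deliberately naive stand-in for whatever relevance signal an application really has, and the hard cut at 100 characters can end mid-word.

```python
import copy

MAX_DESCRIPTION_CHARS = 100  # limit suggested in this report
VERBOSE_KEYS = ("examples", "title")  # assumption: keys treated as non-essential


def minimize_tool(tool, used_params):
    """Return a slimmed copy of one tool definition (the input is not modified)."""
    slim = copy.deepcopy(tool)
    fn = slim["function"]
    fn["description"] = fn.get("description", "")[:MAX_DESCRIPTION_CHARS]
    params = fn.get("parameters")
    if params:
        # Keep only parameters the application actually uses, minus verbose keys.
        params["properties"] = {
            name: {k: v for k, v in schema.items() if k not in VERBOSE_KEYS}
            for name, schema in params.get("properties", {}).items()
            if name in used_params
        }
        if "required" in params:
            params["required"] = [n for n in params["required"] if n in used_params]
    return slim


def select_tools(user_message, tools, keywords_by_tool):
    """Return only the tools whose (hypothetical) trigger keywords appear in the message."""
    text = user_message.lower()
    return [
        tool for tool in tools
        if any(kw in text for kw in keywords_by_tool.get(tool["function"]["name"], ()))
    ]


# Hypothetical tool definition, for illustration only.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city. " * 5,
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "examples": ["Oslo", "Lima"]},
                "units": {"type": "string", "enum": ["metric", "imperial"]},
            },
            "required": ["city"],
        },
    },
}

slim = minimize_tool(WEATHER_TOOL, used_params={"city"})
keywords = {"get_weather": ["weather", "forecast"]}
relevant = select_tools("What's the weather in Oslo?", [slim], keywords)
irrelevant = select_tools("Tell me a joke", [slim], keywords)
```

When the selection comes back empty, the request would presumably omit the 'tools' argument altogether rather than send an empty array; check this against the current API behavior before relying on it.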
Journey Context:
Every tool definition in the 'tools' array is tokenized into the system prompt on every API call, whether or not the model ever calls it. A complex JSON Schema with nested objects and detailed descriptions can consume 500-1000 tokens per tool. At roughly 500 tokens each, 5-10 tools add 2500-5000 tokens (~$0.075-$0.15 per request on GPT-4 at $0.03 per 1K input tokens) before the model generates a single token; at 1000 tokens per tool the overhead doubles. Developers assume tools are 'free' until called, but they incur fixed overhead. The fix requires aggressive schema minimization (short descriptions, no examples in the schema) and dynamic tool injection: only send tool definitions when the conversation state indicates they're relevant.
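The overhead arithmetic above can be reproduced with a short sketch. It assumes GPT-4's original list price of $0.03 per 1K input tokens, which is the rate implied by the figures in this report. The function name and constant are illustrative, and the per-tool token count is the report's estimate, not a measurement.

```python
GPT4_INPUT_USD_PER_1K = 0.03  # assumption: input price implied by the report's figures


def tool_overhead(num_tools, tokens_per_tool, usd_per_1k=GPT4_INPUT_USD_PER_1K):
    """Return (tokens, USD) of fixed per-request overhead from tool definitions alone."""
    overhead_tokens = num_tools * tokens_per_tool
    return overhead_tokens, overhead_tokens / 1000 * usd_per_1k


for n in (5, 10):
    tokens, usd = tool_overhead(n, 500)
    print(f"{n} tools x 500 tokens = {tokens} tokens, ${usd:.3f} per request")
```

Because this cost is paid on every call, it scales with request volume, not with how often a tool is actually invoked.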
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:55:11.247842+00:00— report_created — created