Report #38398
[cost_intel] OpenAI function tool schemas consuming more tokens than the data they replace
Trim JSON schema descriptions to under 100 characters, remove 'title' fields, keep tool names and definitions stable across requests so that prompt caching can apply, or, for single-tool calls with simple parameters, drop the tool schema and prompt for JSON output directly ('manual' JSON mode).
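The pruning steps in the summary above can be sketched roughly as follows. This is a minimal illustration, not an official utility: `prune_schema`, `MAX_DESC_CHARS`, and the `get_weather` tool are hypothetical names, and the character count printed at the end is only a crude proxy for tokens.

```python
import json

# Assumed cap, taken from the "under 100 characters" guidance in this report.
MAX_DESC_CHARS = 99


def prune_schema(node, max_desc=MAX_DESC_CHARS):
    """Return a pruned copy of a tool schema (the input is not modified).

    Drops string-valued 'title' keys and truncates string-valued
    'description' keys. A property that happens to be *named* 'title'
    is kept, because its value is a dict rather than a string.
    """
    if isinstance(node, dict):
        pruned = {}
        for key, value in node.items():
            if key == "title" and isinstance(value, str):
                continue  # redundant annotation, skip it
            if key == "description" and isinstance(value, str):
                pruned[key] = value[:max_desc].rstrip()
            else:
                pruned[key] = prune_schema(value, max_desc)
        return pruned
    if isinstance(node, list):
        return [prune_schema(item, max_desc) for item in node]
    return node


# Hypothetical verbose tool definition, e.g. generated from an OpenAPI spec.
verbose_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Returns the current weather. " * 20,
        "parameters": {
            "type": "object",
            "title": "GetWeatherParameters",
            "properties": {
                "city": {
                    "type": "string",
                    "title": "City",
                    "description": "Name of the city to look up.",
                }
            },
            "required": ["city"],
        },
    },
}

slim_tool = prune_schema(verbose_tool)
# Serialized length is a rough stand-in for token cost, not an exact count.
print(len(json.dumps(verbose_tool)), "->", len(json.dumps(slim_tool)))
```

Because short descriptions can reduce tool selection accuracy, it is worth re-running whatever tool-choice checks you have after pruning rather than assuming the slimmer schema behaves the same.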
Journey Context:
OpenAI injects the full JSON schema of all available tools into every request's system message. A complex tool with detailed descriptions can consume 2-5k tokens per request. For 10 tools with auto-generated OpenAPI specs, that is 20-50k tokens of context window consumed before any user input, often more than the tokens saved by not including the raw data. Common trap: passing verbose OpenAPI specs through unchanged. Caveat: descriptions that are too short hurt the model's tool selection accuracy, so pruning has a cost. Tradeoff: for simple extraction tasks (single function, few parameters), few-shot JSON prompting in the user message avoids the schema overhead entirely and can save up to roughly 90% of input tokens. Proven pattern: aggressively prune descriptions, remove 'title' fields (redundant), and, for Anthropic users, place tool definitions in a cached system prompt block.
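The few-shot JSON prompting mentioned above might look like the sketch below for a single, simple extraction. Everything here is illustrative: the booking fields, `FEW_SHOT`, `build_prompt`, and `parse_reply` are hypothetical, and the actual model call is left out on purpose so that no client API is assumed.

```python
import json

# Hypothetical examples for a two-field extraction task.
FEW_SHOT = [
    ("Book a table for 4 at 7pm", {"party_size": 4, "time": "19:00"}),
    ("Table for two at noon", {"party_size": 2, "time": "12:00"}),
]


def build_prompt(user_text, examples=FEW_SHOT):
    """Build a user message that asks for JSON without any tool schema."""
    lines = ["Extract the fields as a JSON object. Reply with JSON only."]
    for text, expected in examples:
        lines.append(f"Input: {text}")
        lines.append(f"Output: {json.dumps(expected)}")
    lines.append(f"Input: {user_text}")
    lines.append("Output:")
    return "\n".join(lines)


def parse_reply(reply, required=("party_size", "time")):
    """Parse the model's reply and check that the expected keys are present.

    Without a tool schema nothing constrains the output format,
    so validation has to happen on the caller's side.
    """
    data = json.loads(reply.strip())
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = [key for key in required if key not in data]
    if missing:
        raise ValueError(f"reply is missing keys: {missing}")
    return data


prompt = build_prompt("Dinner for 3 at 8pm")
# Send `prompt` as the user message with no tools attached, then:
fields = parse_reply('{"party_size": 3, "time": "20:00"}')
```

The design choice is the one the report describes: the format is conveyed by a couple of short examples instead of a full schema, in exchange for doing your own validation and retry handling.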
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:55:54.418055+00:00— report_created — created