Report #85679
[cost_intel] OpenAI tool definitions consume 500-2000 tokens per request regardless of tool use
Strip all `description` and `examples` fields from tool schemas, move static tool documentation to the system message, and set `strict: false` to avoid the additional schema overhead associated with constrained decoding. Reported reduction in per-request token count: 30-60%.
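The stripping step can be sketched roughly as follows. This is a minimal illustration, not a tested production helper: `strip_docs`, `compact_size`, and the `weather_tool` definition are hypothetical names invented for the example, and the serialized character count is only a rough proxy for token count, not an actual tokenizer measurement.

```python
import json
from typing import Any

# Documentation-only keywords named in the workaround above.
DROP_KEYS = {"description", "examples"}


def strip_docs(node: Any, *, in_properties: bool = False) -> Any:
    """Return a copy of a tool definition without documentation fields.

    Keys directly under a "properties" map are parameter names rather than
    schema keywords, so a parameter that happens to be called "description"
    is kept.
    """
    if isinstance(node, list):
        return [strip_docs(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if not in_properties and key in DROP_KEYS:
            continue
        # Only the direct children of a "properties" keyword are parameter names.
        child_is_property_map = key == "properties" and not in_properties
        out[key] = strip_docs(value, in_properties=child_is_property_map)
    return out


def compact_size(obj: Any) -> int:
    """Character length of the compact JSON form (rough proxy, not tokens)."""
    return len(json.dumps(obj, separators=(",", ":")))


# Hypothetical tool definition, used only for illustration.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "City name, e.g. Paris",
                    "examples": ["Paris"],
                },
                "description": {"type": "string", "description": "Free-text note"},
            },
            "required": ["city"],
        },
    },
}

stripped = strip_docs(weather_tool)
print(compact_size(weather_tool), "->", compact_size(stripped))
```

The original definition is left unmodified, so the verbose version can still be used as the source for the documentation moved into the system message.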
Journey Context:
OpenAI injects the entire JSON schema of every tool in the `tools` array into the context window on every API call, even if the model never calls a tool. A single complex tool with nested objects and verbose descriptions can consume 1k+ tokens. In multi-turn conversations this overhead is paid again on every turn, and because the schemas are not part of the message history, it is easy to miss when sizing the context from the messages alone. A further cost, which this report describes as undocumented, is that `strict: true` (required for structured outputs that are guaranteed to conform to the schema) can roughly double the schema size through `additionalProperties: false` and enum constraints. Teams often define 10+ tools "just in case," silently inflating a 4k context to 12k.
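To make the repetition cost concrete, here is a small arithmetic sketch using the figures from this report (a 4k context inflated to 12k, so roughly 8k tokens of tool schemas). The function names and the 10-turn conversation length are assumptions chosen for illustration; actual token counts depend on the model's tokenizer and on how the schemas are rendered into the prompt.

```python
def cumulative_tool_overhead(schema_tokens_per_request: int, turns: int) -> int:
    """Total tokens spent on tool schemas when they are resent on every turn."""
    return schema_tokens_per_request * turns


def schema_share(schema_tokens: int, other_tokens: int) -> float:
    """Fraction of a single request's context taken up by tool schemas."""
    return schema_tokens / (schema_tokens + other_tokens)


# Figures from the report: 4k of real context, 12k total, so about 8k of schemas.
schema_tokens = 12_000 - 4_000

# Hypothetical 10-turn conversation.
print(cumulative_tool_overhead(schema_tokens, turns=10))  # 80000
print(round(schema_share(schema_tokens, 4_000), 2))       # 0.67
```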
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:24:01.803603+00:00— report_created — created