Report #79005
[cost_intel] OpenAI function tool definitions are resent on every turn, multiplying context size by 5-10x
Audit the token count with tiktoken, including the 'tools' array sent in every request. If the tool schemas exceed 200 tokens in total, either refactor to 'strict': false with minimal schemas (type/name only) and move the detailed parameter descriptions into the system message, or replace multiple similar tools with a single 'universal_tool' that takes a 'command' enum to collapse schema complexity.
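The audit can be sketched roughly as follows. This is a minimal illustration, not a measurement tool: `tools_overhead`, `rough_token_count`, and the `get_weather` tool are hypothetical names, the default counter is a crude 4-characters-per-token heuristic, and the API's internal rendering of the 'tools' array may not match a JSON dump, so treat the numbers as estimates.

```python
import json
from typing import Callable


def rough_token_count(text: str) -> int:
    """Crude stand-in tokenizer: assumes about 4 characters per token."""
    return max(1, len(text) // 4)


def tools_overhead(
    tools: list[dict],
    turns: int,
    count_tokens: Callable[[str], int] = rough_token_count,
    budget: int = 200,  # threshold suggested in this report
) -> dict:
    """Estimate input tokens spent on the 'tools' array over a conversation."""
    # Compact serialization; the real wire/prompt format may differ.
    serialized = json.dumps(tools, separators=(",", ":"))
    per_request = count_tokens(serialized)
    return {
        "per_request": per_request,
        "total": per_request * turns,  # resent in full on every turn
        "over_budget": per_request > budget,
    }


# To count with tiktoken instead of the heuristic (needs `pip install tiktoken`):
#   import tiktoken
#   enc = tiktoken.get_encoding("cl100k_base")
#   tools_overhead(tools, 20, lambda s: len(enc.encode(s)))

# Hypothetical tool definition used only for illustration.
example_tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string", "description": "City name"}},
            "required": ["city"],
        },
    },
}]

print(tools_overhead(example_tools, turns=20))
```

Passing a counter that returns 400 with `turns=20` gives a total of 8,000, which matches the arithmetic in the Journey Context.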
Journey Context:
Unlike conversation history, the 'tools' array in OpenAI's API is not stateful context that gets compressed; it is inline metadata resent in full with every request. A 400-token tool definition in a 20-turn conversation adds 8,000 tokens (400 x 20) of hidden input cost, often more than the output tokens generated. Developers tend to model tools as 'plugins' that are loaded once at initialization, like a DLL, but the API treats them as prompt text injected on every call. The cost is especially steep for agents with 10+ tools or rich JSON schemas that use $defs and descriptions. The mitigation is to treat tool definitions as expensive prompt real estate: remove the 'description' fields from parameters (moving them into system prompt examples), use 'strict': false to avoid internal grammar expansion, and consolidate tools into a smaller set with runtime dispatch logic.
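The consolidation step might look like the sketch below. Everything here is an assumption made for illustration: the handlers `create_note` and `delete_note`, the `args` wrapper object, and the `dispatch` helper are hypothetical, and parameter documentation is assumed to live in the system prompt rather than in the schema.

```python
import json
from typing import Any, Callable


# Hypothetical handlers standing in for what used to be separate tools.
def create_note(args: dict) -> str:
    return f"created:{args['title']}"


def delete_note(args: dict) -> str:
    return f"deleted:{args['id']}"


HANDLERS: dict[str, Callable[[dict], Any]] = {
    "create_note": create_note,
    "delete_note": delete_note,
}

# One compact definition replaces one schema per handler.
# No parameter descriptions: those are assumed to be in the system prompt.
UNIVERSAL_TOOL = {
    "type": "function",
    "function": {
        "name": "universal_tool",
        "strict": False,
        "parameters": {
            "type": "object",
            "properties": {
                "command": {"type": "string", "enum": sorted(HANDLERS)},
                "args": {"type": "object"},
            },
            "required": ["command"],
        },
    },
}


def dispatch(arguments_json: str) -> Any:
    """Route a tool call's JSON arguments string to the matching handler."""
    payload = json.loads(arguments_json)
    command = payload.get("command")
    if command not in HANDLERS:
        # A loose schema moves validation into application code.
        raise ValueError(f"unknown command: {command!r}")
    return HANDLERS[command](payload.get("args", {}))


print(dispatch('{"command": "create_note", "args": {"title": "todo"}}'))
# prints "created:todo"
```

The trade-off is that a minimal, non-strict schema no longer constrains the arguments, so each handler has to validate its own input.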
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:12:14.121971+00:00 - report_created - created