Report #63036
[cost_intel] OpenAI tool definitions impose a static context tax regardless of usage
Shard tool libraries across multiple model instances, or use 'strict': false with minimal schemas. Move rarely used tools to retrieval-augmented generation (RAG) style dynamic insertion, adding a tool definition to the request only when the user query clears a semantic-similarity threshold.
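A minimal sketch of threshold-based dynamic tool insertion. Everything here is an illustrative assumption: the three tools, the `select_tools` helper, the stopword list, and the 0.3 threshold are hypothetical, and the bag-of-words cosine similarity is a dependency-free stand-in for a real embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list; a real embedding model would not need one.
STOPWORDS = {"a", "an", "the", "for", "is", "what", "me", "of", "to", "in"}


def _schema(name, description):
    """Build a minimal function-tool entry (hypothetical tools, no parameters)."""
    return {
        "type": "function",
        "function": {
            "name": name,
            "description": description,
            "parameters": {"type": "object", "properties": {}},
        },
    }


# Hypothetical tool library; only matching entries are sent with a request.
TOOL_LIBRARY = [
    _schema("get_weather", "Get the current weather forecast for a city"),
    _schema("refund_order", "Issue a refund for a customer order"),
    _schema("search_flights", "Search available flights between two airports"),
]


def _vector(text):
    """Lowercased word counts with stopwords removed."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)


def _cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def select_tools(query, library=TOOL_LIBRARY, threshold=0.3):
    """Return only the tool definitions whose description matches the query."""
    q = _vector(query)
    return [
        tool
        for tool in library
        if _cosine(q, _vector(tool["function"]["description"])) >= threshold
    ]
```

The returned list would be passed as the `tools` argument of the request. When it comes back empty, it is probably safer to omit `tools` entirely than to send an empty array.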
Journey Context:
Every tool definition in the 'tools' array is tokenized into the context window on every single API call, even if the model never calls that tool. A complex schema with 20 tools can consume 4,000+ tokens before any user input. That is about $0.12 per call at $30 per million input tokens; GPT-4o's list price is lower, so check the current rate, but the overhead still recurs on every request. The common mistake is defining the 'ultimate agent' with all possible tools. The fix is aggressive tool sharding: run a separate cheap classification call to route each request to a specialized tool subset, or use semantic similarity to inject tool definitions dynamically only when needed.
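The per-call overhead is simple arithmetic, sketched below. The `static_tool_cost` helper is hypothetical, and the rates passed to it are example inputs rather than quoted prices; the token count would come from measuring the actual schemas.

```python
def static_tool_cost(tool_tokens, usd_per_million_input_tokens, calls=1):
    """Estimate the input cost of tool definitions resent on every call."""
    return tool_tokens * usd_per_million_input_tokens / 1_000_000 * calls


# 4,000 tokens of tool definitions at an assumed $30 per million input tokens.
per_call = static_tool_cost(4_000, 30.0)

# The same schemas at an assumed $2.50 per million, over 100,000 calls.
monthly = static_tool_cost(4_000, 2.50, calls=100_000)

print(f"per call: ${per_call:.2f}")
print(f"100k calls: ${monthly:,.2f}")
```

The point of the sketch is that the cost scales with call volume rather than with how often a tool is actually invoked, which is why trimming the 'tools' array pays off even at low per-token rates.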
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:17:16.409228+00:00— report_created — created