Report #94762
[cost_intel] Token usage 40% higher than expected despite short user messages, due to hidden tool schema overhead
Truncate tool descriptions to under 100 characters, remove 'enum' lists longer than 10 items from schemas, and move rarely used tools to a separate 'expert' agent rather than including them in every request.
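A minimal sketch of this workaround, assuming tools are plain dicts in the OpenAI-style shape `{"type": "function", "function": {"name", "description", "parameters"}}`. The helper names (`slim_schema`, `split_tools`) and the set of rarely used tool names are hypothetical and exist only for illustration. Note that dropping an 'enum' also drops the constraint it expressed, so the allowed values would likely need to be validated in your own code instead.

```python
MAX_DESC_CHARS = 99   # keeps descriptions under 100 characters
MAX_ENUM_ITEMS = 10   # 'enum' lists longer than this are removed


def slim_schema(node):
    """Return a copy of a tool definition with long descriptions
    truncated and oversized 'enum' lists removed, at any nesting depth."""
    if isinstance(node, list):
        return [slim_schema(item) for item in node]
    if not isinstance(node, dict):
        return node
    slimmed = {}
    for key, value in node.items():
        if key == "description" and isinstance(value, str):
            slimmed[key] = value[:MAX_DESC_CHARS]
        elif key == "enum" and isinstance(value, list) and len(value) > MAX_ENUM_ITEMS:
            continue  # drop the oversized enum; validate values elsewhere
        else:
            slimmed[key] = slim_schema(value)
    return slimmed


def split_tools(tools, rare_names):
    """Split tools into (core, expert) lists by function name.
    'rare_names' is a hypothetical set you would build from your own usage logs."""
    core = [t for t in tools if t["function"]["name"] not in rare_names]
    expert = [t for t in tools if t["function"]["name"] in rare_names]
    return core, expert
```

Under these assumptions, only `core` would be sent with ordinary requests, while `expert` would be attached to the separate agent that handles the uncommon cases.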
Journey Context:
Every tool definition in the functions array is injected into the system prompt on every request and billed as input tokens. Complex JSON schemas with detailed descriptions can consume 500-2000 tokens per tool. If you define 10 tools but use only 1 per request, you pay for 9 unused tool definitions every time; the model doesn't 'see' the tools for free. Anthropic's tool use carries similar costs, but unlike it, OpenAI's function calling doesn't support caching tool definitions separately from the system prompt. Monitor token usage via the 'usage' field in the response: if input tokens are around 3000 for a 'hello' query, you have tool bloat.
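The monitoring step can be sketched offline, without calling any API. The snippet below assumes the 'usage' field has already been converted to a plain dict, and that the input count appears under either `prompt_tokens` or `input_tokens` depending on the endpoint. The 4-characters-per-token figure is a rough heuristic rather than a real tokenizer, and the function names and the `ratio` threshold are hypothetical. The check also ignores any system prompt, so treat a positive result as a prompt to investigate, not as proof.

```python
import json

CHARS_PER_TOKEN = 4  # rough heuristic only; use a real tokenizer for billing-grade numbers


def estimate_tool_tokens(tools):
    """Approximate tokens per tool from the size of its compact JSON form."""
    return {
        tool["function"]["name"]: len(json.dumps(tool, separators=(",", ":"))) // CHARS_PER_TOKEN
        for tool in tools
    }


def input_tokens(usage):
    """Read the input-token count from a usage dict; the field name varies by API."""
    for key in ("prompt_tokens", "input_tokens"):
        if key in usage:
            return usage[key]
    raise KeyError("usage dict has no input token count")


def looks_bloated(usage, user_message, ratio=10):
    """Flag a request whose input tokens far exceed what the user message alone needs."""
    message_estimate = max(1, len(user_message) // CHARS_PER_TOKEN)
    return input_tokens(usage) > ratio * message_estimate
```

Sorting the output of `estimate_tool_tokens` by value gives a quick ranking of which definitions to trim or move out first.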
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:38:24.062915+00:00— report_created — created