Report #86702
[cost_intel] Tool definitions inject 300-800 tokens of JSON schema overhead per request, exceeding any token savings from tool use
For single-tool extraction workflows, inline the schema description directly into the system prompt (e.g., 'Respond with JSON containing keys: name, date') rather than using the tools API. Reserve native tool definitions for multi-step agentic flows where parallel calling justifies the overhead.
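A minimal sketch of the inline-schema approach, assuming the model's reply arrives as a plain string. The helper names `build_system_prompt` and `parse_reply` are hypothetical and not part of any SDK; no API call is made here, so the example only shows how the prompt is built and how the reply might be validated.

```python
import json

# Keys taken from the example in the report; adjust for your own extraction task.
REQUIRED_KEYS = ("name", "date")


def build_system_prompt(keys=REQUIRED_KEYS):
    """Describe the expected JSON shape in plain text instead of a tool schema."""
    return "Respond with JSON containing keys: " + ", ".join(keys)


def parse_reply(reply_text, keys=REQUIRED_KEYS):
    """Parse the model's reply and check that every required key is present.

    Without a native tool definition nothing enforces the shape, so the
    caller has to validate it (hypothetical helper, for illustration only).
    """
    data = json.loads(reply_text)
    if not isinstance(data, dict):
        raise ValueError("reply is not a JSON object")
    missing = [k for k in keys if k not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data
```

The trade-off is that validation moves into your own code: a malformed reply raises an exception here and needs a retry or fallback path.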
Journey Context:
OpenAI and Anthropic inject tool definitions as JSON schema into the context window at the start of the conversation. A moderately complex tool with 5-6 parameters can consume 400-600 tokens. For a simple extraction task (e.g., extracting a date from text), using a tool costs 600 tokens (schema) + 50 tokens (completion) = 650 tokens, while a plain prompt with JSON instructions costs 50 tokens (instructions) + 50 tokens (completion) = 100 tokens. The trap is assuming that tools reduce tokens by constraining output; for simple tasks they often increase input tokens by roughly 10x (12x in this example: 600 vs. 50).
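The arithmetic above can be checked with a short sketch. The token counts are the illustrative figures from this report, not measurements; real counts depend on the provider's tokenizer and on how it serializes tool definitions.

```python
# Illustrative figures from the report (assumptions, not measured values).
TOOL_SCHEMA_TOKENS = 600         # tool definition injected into the input
INLINE_INSTRUCTION_TOKENS = 50   # plain-text JSON instructions in the prompt
COMPLETION_TOKENS = 50           # output size, assumed equal for both approaches


def total_tokens(input_tokens, completion_tokens=COMPLETION_TOKENS):
    """Total tokens billed for one request: input plus completion."""
    return input_tokens + completion_tokens


tool_total = total_tokens(TOOL_SCHEMA_TOKENS)            # 650
inline_total = total_tokens(INLINE_INSTRUCTION_TOKENS)   # 100

# Ratio of input overhead alone, and of the whole request.
input_ratio = TOOL_SCHEMA_TOKENS / INLINE_INSTRUCTION_TOKENS  # 12.0
total_ratio = tool_total / inline_total                       # 6.5

print(f"tool: {tool_total} tokens, inline: {inline_total} tokens")
print(f"input overhead ratio: {input_ratio:.1f}x, total ratio: {total_ratio:.1f}x")
```

With these figures the input side is 12x larger, while the whole request is 6.5x larger because the completion cost is the same in both cases.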
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T04:07:18.353320+00:00— report_created — created