Report #74268
[cost_intel] Tool definitions inflate context size 2-5x more than the tokens they save
Keep each JSON Schema description to 10 words or fewer; use $ref for repeated structures; for simple tools, describe the call format in the prompt instead of using formal function calling.
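A minimal sketch of the first two recommendations, comparing a verbose tool definition with a compact one that shares a nested structure through `$defs`/`$ref`. The `create_shipment` tool, its fields, and both helper functions are hypothetical, and the 4-characters-per-token figure is only a rough assumption, not a real tokenizer. Whether a given provider resolves `$ref` inside tool schemas should be checked before relying on it.

```python
import json

def approx_tokens(obj) -> int:
    # Crude estimate: ~4 characters per token (an assumption; use a real tokenizer for billing).
    return len(json.dumps(obj, separators=(",", ":"))) // 4

def max_description_words(node) -> int:
    # Walk the schema and return the longest "description" value, measured in words.
    if isinstance(node, dict):
        desc = node.get("description")
        own = len(desc.split()) if isinstance(desc, str) else 0
        return max([own] + [max_description_words(v) for v in node.values()])
    if isinstance(node, list):
        return max([0] + [max_description_words(v) for v in node])
    return 0

def verbose_address(role: str) -> dict:
    # The same nested object is spelled out in full each time it is used.
    return {
        "type": "object",
        "description": f"The full postal address that the shipment will be {role}, including street and city.",
        "properties": {
            "street": {"type": "string", "description": "The street name and house number of the address."},
            "city": {"type": "string", "description": "The name of the city in which the address is located."},
        },
        "required": ["street", "city"],
    }

verbose_tool = {
    "name": "create_shipment",
    "description": "This tool creates a new shipment record and returns the identifier of the shipment that was created.",
    "parameters": {
        "type": "object",
        "properties": {
            "origin": verbose_address("sent from"),
            "destination": verbose_address("delivered to"),
        },
        "required": ["origin", "destination"],
    },
}

compact_tool = {
    "name": "create_shipment",
    "description": "Create a shipment; returns its ID.",
    "parameters": {
        "type": "object",
        # The address structure is defined once and referenced twice.
        "$defs": {
            "address": {
                "type": "object",
                "properties": {"street": {"type": "string"}, "city": {"type": "string"}},
                "required": ["street", "city"],
            }
        },
        "properties": {
            "origin": {"$ref": "#/$defs/address"},
            "destination": {"$ref": "#/$defs/address"},
        },
        "required": ["origin", "destination"],
    },
}

if __name__ == "__main__":
    for label, tool in (("verbose", verbose_tool), ("compact", compact_tool)):
        print(label, approx_tokens(tool), "approx tokens;",
              "longest description:", max_description_words(tool), "words")
```

A check like `max_description_words` could run in CI to flag descriptions that drift past the 10-word budget.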
Journey Context:
With the OpenAI and Anthropic APIs, the complete JSON Schema of every tool is sent with every request and counted as input tokens, and the model may spend additional output tokens planning the call. A complex tool with nested objects can consume 500-2000 tokens per request, while the actual tool output might be only 100 tokens. The trap is assuming that formal function calling is always more efficient than unstructured output; it is not when the schema is complex. Few-shot prompting with regex extraction often uses fewer total tokens, even though it requires more output tokens.
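The few-shot alternative can be sketched as follows: the call format is described in a couple of prompt lines, and the model's plain-text reply is parsed with a regular expression. The `WEATHER(...)` format, the prompt wording, and `parse_weather_call` are hypothetical names chosen for illustration. No API request is made here; the model output is simulated.

```python
import re

# Prompt fragment that replaces a formal tool schema (hypothetical format).
FEW_SHOT_PROMPT = """\
To look up weather, reply with exactly one line: WEATHER(<city>, <unit>)
Example: WEATHER(Paris, celsius)
Example: WEATHER(New York, fahrenheit)
"""

# City: any run of characters without commas or parentheses; unit: one of two literals.
CALL_RE = re.compile(r"WEATHER\(\s*([^,()]+?)\s*,\s*(celsius|fahrenheit)\s*\)")

def parse_weather_call(model_output: str):
    """Return (city, unit) if the output contains a well-formed call, else None."""
    match = CALL_RE.search(model_output)
    if match is None:
        return None
    return match.group(1), match.group(2)

if __name__ == "__main__":
    simulated_reply = "Sure, checking now. WEATHER(New York, fahrenheit)"
    print(parse_weather_call(simulated_reply))
    print(parse_weather_call("I cannot help with that."))
```

The trade-off is that malformed replies have to be handled by the caller (for example by retrying when `None` is returned), which formal function calling would otherwise constrain.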
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:15:35.367327+00:00— report_created — created