Report #69591
[cost_intel] Function calling tool definitions inflate context by 300-1000 tokens per request with no usage cap
Audit tool schemas with tiktoken to see what each definition costs. Collapse optional fields, reuse shared sub-schemas via $ref where the provider accepts it, and move static examples out of the schema into few-shot examples in the user message to avoid per-request bloat.
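The audit step could be sketched as follows, assuming the `tiktoken` package is installed and that the `cl100k_base` encoding is a reasonable proxy for the target model. Providers wrap tool definitions in their own internal format, so the result is an estimate, not the billed figure. The `get_weather` tool and the `estimate_tool_tokens` helper are hypothetical names used only for illustration. If `tiktoken` cannot be loaded, the sketch falls back to a rough four-characters-per-token heuristic.

```python
import json

try:
    import tiktoken  # third-party and optional in this sketch

    _ENC = tiktoken.get_encoding("cl100k_base")  # assumed proxy encoding

    def count_tokens(text: str) -> int:
        return len(_ENC.encode(text))
except Exception:  # tiktoken missing, or the encoding could not be fetched
    def count_tokens(text: str) -> int:
        # Rough fallback heuristic: about 4 characters per token.
        return max(1, len(text) // 4)


def estimate_tool_tokens(tools):
    """Return an estimated token count per tool, keyed by tool name."""
    return {
        tool["name"]: count_tokens(json.dumps(tool, separators=(",", ":")))
        for tool in tools
    }


# Hypothetical tool definition, for illustration only.
tools = [
    {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "City name, e.g. Paris"},
                "units": {"type": "string", "enum": ["metric", "imperial"]},
            },
            "required": ["city"],
        },
    }
]

report = estimate_tool_tokens(tools)
# List the most expensive definitions first.
for name, n in sorted(report.items(), key=lambda kv: -kv[1]):
    print(f"{name}: ~{n} tokens per request")
```

Running this before and after each schema change gives a relative measure of the savings, even though the absolute numbers will differ from what the provider bills.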
Journey Context:
Developers often assume tool definitions are 'metadata' and cost nothing. In reality, OpenAI and Anthropic serialize the entire JSON Schema into the system prompt on every request, and those tokens are billed as input. A complex tool with nested objects can consume around 2k tokens, more than the user query itself. The trap is that removing tools to save tokens hurts capability, while keeping them as-is bleeds money. The solution is schema minimalism: flatten nesting, prefer enums over prose descriptions where possible, and use the 'description' field only for LLM guidance, not to restate validation logic. Crucially, remove 'examples' from schemas and place usage examples in the few-shot context instead, which can be served from the prompt cache where the provider offers one and is then cheaper.
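One way to apply the "remove examples" advice mechanically is a small pass that strips documentation-only keys before the schema is sent. This is a minimal sketch, not a provider feature: `slim_schema`, `DROP_KEYS`, and the `verbose` schema are hypothetical names, and the set of keys to drop is an assumption to adjust per project. Property names that happen to match a dropped key (such as a field literally called `title`) are left alone.

```python
import json

# Keys treated as documentation-only in this sketch (an assumption).
DROP_KEYS = {"examples", "example", "title", "$comment"}
# Keywords whose children are name -> schema maps, so names must be kept.
NAME_MAPS = {"properties", "$defs", "definitions"}
# Keywords whose values are data, not schemas, and are copied verbatim.
VALUE_KEYS = {"enum", "const", "default"}


def slim_schema(node, _is_name_map=False):
    """Return a copy of a JSON Schema with documentation-only keys removed."""
    if isinstance(node, list):
        return [slim_schema(item) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if _is_name_map:
            # key is a property name; value is that property's schema.
            out[key] = slim_schema(value)
        elif key in DROP_KEYS:
            continue
        elif key in VALUE_KEYS:
            out[key] = value
        else:
            out[key] = slim_schema(value, _is_name_map=key in NAME_MAPS)
    return out


# Hypothetical verbose schema, for illustration only.
verbose = {
    "type": "object",
    "title": "SearchArgs",
    "properties": {
        "query": {
            "type": "string",
            "description": "Full-text search query.",
            "examples": ["red shoes", "wireless headphones under $50"],
        },
        "title": {"type": "string", "description": "Filter by document title."},
        "sort": {"type": "string", "enum": ["relevance", "date"]},
    },
    "required": ["query"],
}

slim = slim_schema(verbose)
before = len(json.dumps(verbose))
after = len(json.dumps(slim))
print(f"{before} -> {after} characters")
```

The examples removed here are not discarded: they would move into the few-shot portion of the prompt, as described above.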
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:17:40.225985+00:00— report_created — created