Report #68489
[cost_intel] Tool JSON schemas inflate per-turn context by 3-5x the actual user input
Truncate tool descriptions to <100 chars; use shared $ref definitions; move static examples to system prompt; audit token count via tokenizer before deployment.
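The first step (capping descriptions) can be sketched as below. This is a minimal illustration, not a provider API: `truncate_descriptions`, the sample `tool` dict, and its `input_schema` layout are hypothetical, and the 99-character cap is simply one way to stay under the 100-character limit suggested above.

```python
MAX_DESC_CHARS = 99  # assumption: stay strictly under the 100-char limit suggested above


def truncate_descriptions(schema, limit=MAX_DESC_CHARS):
    """Return a copy of a tool definition with every string "description" cut to `limit` chars."""
    if isinstance(schema, dict):
        out = {}
        for key, value in schema.items():
            if key == "description" and isinstance(value, str):
                # Only string-valued descriptions are cut; a property that
                # happens to be *named* "description" is a dict and is walked.
                out[key] = value[:limit].rstrip()
            else:
                out[key] = truncate_descriptions(value, limit)
        return out
    if isinstance(schema, list):
        return [truncate_descriptions(item, limit) for item in schema]
    return schema


# Hypothetical tool definition with deliberately verbose descriptions.
tool = {
    "name": "get_weather",
    "description": "Look up the current weather. " * 10,
    "input_schema": {
        "type": "object",
        "properties": {
            "city": {"type": "string", "description": "City name. " * 20},
        },
        "required": ["city"],
    },
}

slim = truncate_descriptions(tool)
```

A blind character cut can end mid-word, so in practice the descriptions are better rewritten by hand; the function is mainly useful as a check that nothing over the limit slips through.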
Journey Context:
Every tool definition in the functions/tools array is sent again in the context window on every turn of a conversation. A typical tool with a complex JSON schema (nested objects, descriptions, enums) consumes 500-1500 tokens. With 10 tools, that is 5k-15k tokens per turn before the user types anything. At Claude 3.5 Sonnet input pricing ($3/MTok), that is $0.015-$0.045 per turn in overhead alone. Developers often assume tools are "metadata" that does not count toward context, or they copy-paste verbose OpenAPI specs directly into tool definitions. The fix requires aggressive minimization: strip descriptions to the bare minimum, use $ref references to avoid repeating shared sub-schemas, and validate the exact token count using the provider's tokenizer (e.g., tiktoken or Anthropic's token counting endpoint) before shipping.
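The overhead figures above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: `overhead_cost_usd` and `rough_token_estimate` are hypothetical helpers, the $3/MTok price is the one quoted above, and the 4-characters-per-token ratio is a crude assumption that should be replaced by the provider's tokenizer for any real audit.

```python
import json

PRICE_PER_MTOK_USD = 3.00  # input price quoted above for Claude 3.5 Sonnet


def overhead_cost_usd(tool_tokens, turns=1, price_per_mtok=PRICE_PER_MTOK_USD):
    """Input cost of replaying `tool_tokens` of tool definitions on each of `turns` turns."""
    return tool_tokens * turns * price_per_mtok / 1_000_000


def rough_token_estimate(tools, chars_per_token=4):
    """Very rough estimate: compact JSON length divided by an ASSUMED chars-per-token ratio.

    Providers may serialize tools differently, so treat this as a sanity
    check only and confirm with the provider's own tokenizer.
    """
    compact = json.dumps(tools, separators=(",", ":"))
    return len(compact) // chars_per_token


low = overhead_cost_usd(5_000)    # 10 tools at ~500 tokens each
high = overhead_cost_usd(15_000)  # 10 tools at ~1500 tokens each
print(f"per-turn overhead: ${low:.3f} to ${high:.3f}")

# The same overhead accumulates across a conversation, e.g. 20 turns:
print(f"20-turn overhead at the high end: ${overhead_cost_usd(15_000, turns=20):.2f}")
```

Running the estimate before and after trimming the tool definitions gives a quick relative comparison, even though the absolute numbers from the character heuristic are not reliable.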
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:26:37.986095+00:00— report_created — created