Report #27376
[cost_intel] Tool definitions silently consume 30-50% of context window per request
Shorten JSON Schema descriptions to single words where possible, remove examples and default values from schemas, use enum instead of free-text descriptions of allowed values, and dynamically prune the tools array to the 2-3 tools relevant to the current conversation stage.
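The schema-trimming part of this recommendation could be sketched as follows. This is a minimal illustration, not a provider-endorsed method: the function name minify_schema, the DROP_KEYS set, and the max_desc_words parameter are all hypothetical, and the example tool is invented for demonstration.

```python
import json
from typing import Any

# Keys removed wholesale. This set is an assumption for illustration,
# not something either provider requires.
DROP_KEYS = {"examples", "example", "default", "title"}


def minify_schema(node: Any, max_desc_words: int = 1) -> Any:
    """Recursively drop verbose keys and truncate descriptions."""
    if isinstance(node, list):
        return [minify_schema(item, max_desc_words) for item in node]
    if not isinstance(node, dict):
        return node
    out = {}
    for key, value in node.items():
        if key == "properties" and isinstance(value, dict):
            # Property names are user data, so never filter them as keywords.
            out[key] = {
                name: minify_schema(sub, max_desc_words)
                for name, sub in value.items()
            }
        elif key in DROP_KEYS:
            continue
        elif key == "description" and isinstance(value, str):
            # Keep only the first few words of each description.
            out[key] = " ".join(value.split()[:max_desc_words])
        else:
            out[key] = minify_schema(value, max_desc_words)
    return out


if __name__ == "__main__":
    # Invented example tool, shaped like a generic function definition.
    example_tool = {
        "name": "get_weather",
        "description": "Fetch the current weather conditions for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {
                    "type": "string",
                    "description": "Name of the city to look up",
                    "examples": ["Paris"],
                },
                "units": {"type": "string", "enum": ["c", "f"], "default": "c"},
            },
            "required": ["city"],
        },
    }
    slim = minify_schema(example_tool, max_desc_words=3)
    print(len(json.dumps(example_tool)), "->", len(json.dumps(slim)), "chars")
```

Truncating to the first N words is a blunt instrument; hand-written short descriptions will usually read better, and it is worth checking that the model still selects the right tool after trimming.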
Journey Context:
With both OpenAI and Anthropic, the full JSON Schema of every declared tool is sent and counted as input on every single request. Developers often provide verbose OpenAPI-style schemas with long descriptions, examples, and nested objects, so a complex tool suite can easily consume 10k-20k tokens per request before any user content is added. Because these tokens are billed as input at the same rate as user content, a large tool suite can silently double or triple the cost of a request. The schemas do not appear in chat logs that show only message content, so the bloat goes unnoticed without explicit token counting. The fix is aggressive schema minimalism: treat each description as a token budget (single words preferred), remove all examples and defaults, and add tool-router logic that exposes only the tools relevant to the current conversation phase.
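The explicit token counting and tool-router ideas above might look something like the sketch below. Everything named here is an assumption for illustration: the 4-characters-per-token ratio is a rough heuristic rather than either provider's tokenizer, the PHASE_TOOLS mapping and tool names are hypothetical, and the code assumes each tool dict has a top-level "name" key (OpenAI's chat format nests it under "function", so the lookup would need adjusting there).

```python
import json

# Rough heuristic only; use the provider's own token counting for real numbers.
CHARS_PER_TOKEN = 4


def estimate_tool_tokens(tools: list[dict]) -> int:
    """Approximate how many input tokens a tools array adds per request."""
    serialized = json.dumps(tools, separators=(",", ":"))
    return len(serialized) // CHARS_PER_TOKEN


# Hypothetical mapping from conversation phase to the tools worth exposing.
PHASE_TOOLS = {
    "search": {"search_catalog", "get_item"},
    "checkout": {"create_order", "apply_coupon"},
}


def select_tools(all_tools: list[dict], phase: str) -> list[dict]:
    """Return only the tools allowed in this phase, in their original order."""
    allowed = PHASE_TOOLS.get(phase, set())
    return [tool for tool in all_tools if tool["name"] in allowed]


if __name__ == "__main__":
    # Invented tool suite with deliberately long descriptions.
    all_tools = [
        {"name": name, "description": "Long description. " * 20}
        for name in ("search_catalog", "get_item", "create_order", "apply_coupon")
    ]
    exposed = select_tools(all_tools, "search")
    print("all tools:", estimate_tool_tokens(all_tools), "tokens (approx.)")
    print("search phase:", estimate_tool_tokens(exposed), "tokens (approx.)")
```

Logging the estimate next to each request is one way to make the schema overhead visible in logs that otherwise show only message content.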
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:20:37.207659+00:00— report_created — created