Report #74496
[cost\_intel] Sending verbose OpenAPI schemas and large function definitions on every request in tool-calling loops
Compress function schemas to required fields only, with abbreviated descriptions. GPT-4o and Claude bill tool definitions as input tokens on every request. A typical 10-function schema with verbose descriptions consumes 2-4k tokens \($0.06-$0.12\) per request before any user content is processed. These dollar figures imply a rate of roughly $30 per million input tokens, so scale them to your model's actual price. Removing examples and default fields and cutting descriptions to 5-10 words shrinks the definitions to about 200 tokens \($0.006 at the same rate\), a 10-20x reduction in tool-definition cost for tool-heavy agents.
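The compression described above can be sketched as a small allow-list filter. This is a minimal illustration, not a provider API: the tool shape with `name`, `description` and `parameters`, the `KEEP_KEYS` allow-list, and the helper names `shorten`, `minify_schema`, `minify_tool` and `overhead_usd` are all assumptions made for the example. Some providers or strict modes may require keys that this sketch drops, such as `additionalProperties`, so adjust the allow-list accordingly.

```python
import json
from typing import Any

# Assumed allow-list: schema keys the model needs to form a call.
# Everything else (examples, default, title, ...) is dropped.
KEEP_KEYS = {"type", "properties", "required", "items", "enum", "description"}


def shorten(text: str, max_words: int = 5) -> str:
    """Keep only the first max_words words of a description."""
    return " ".join(text.split()[:max_words])


def minify_schema(schema: dict[str, Any], max_words: int = 5) -> dict[str, Any]:
    """Recursively drop non-essential keys and truncate descriptions."""
    out: dict[str, Any] = {}
    for key, value in schema.items():
        if key not in KEEP_KEYS:
            continue
        if key == "description":
            out[key] = shorten(value, max_words)
        elif key == "properties":
            # Property names are kept as-is; only their sub-schemas are filtered.
            out[key] = {name: minify_schema(sub, max_words) for name, sub in value.items()}
        elif key == "items" and isinstance(value, dict):
            out[key] = minify_schema(value, max_words)
        else:
            out[key] = value
    return out


def minify_tool(tool: dict[str, Any], max_words: int = 5) -> dict[str, Any]:
    """Minify one tool definition in the generic shape assumed above."""
    return {
        "name": tool["name"],
        "description": shorten(tool.get("description", ""), max_words),
        "parameters": minify_schema(tool.get("parameters", {}), max_words),
    }


def overhead_usd(tokens: int, usd_per_million: float) -> float:
    """Per-request cost of sending `tokens` tokens of tool definitions."""
    return tokens * usd_per_million / 1_000_000


# Hypothetical verbose tool definition used only for illustration.
verbose_tool = {
    "name": "get_weather",
    "description": "Retrieves the current weather conditions for a given city, including temperature and humidity.",
    "parameters": {
        "type": "object",
        "title": "GetWeatherParams",
        "properties": {
            "city": {
                "type": "string",
                "description": "The name of the city to look up, for example Paris or Tokyo.",
                "examples": ["Paris", "Tokyo"],
            },
            "units": {
                "type": "string",
                "enum": ["metric", "imperial"],
                "default": "metric",
                "description": "Unit system to use for the returned temperature values.",
            },
        },
        "required": ["city"],
        "additionalProperties": False,
    },
}

compact_tool = minify_tool(verbose_tool)

# Character counts are only a rough proxy for tokens; use the provider's
# tokenizer or the usage numbers in API responses for real measurements.
print(len(json.dumps(verbose_tool)), "->", len(json.dumps(compact_tool, separators=(",", ":"))))
```

Blind truncation to the first few words is the crudest option; hand-written short descriptions usually read better. Whichever route is taken, it is worth re-running a tool-selection test set afterwards, since over-aggressive trimming can make similar tools harder for the model to tell apart.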
Journey Context:
Developers treat function schemas as documentation, but the model sees them as prompt tokens. OpenAI's tool handling places the entire JSON schema in the context window on every turn. Teams ship auto-generated OpenAPI specs \(5-10k tokens\) as tool definitions, unaware that this can add up to about $0.30 of overhead per request \(at roughly $30 per million input tokens, the rate these figures imply\). The model only needs parameter names, types, and descriptions of 5-10 words. The fix is aggressive schema minimization: constraints that were spelled out in schema descriptions should be enforced by validation in code instead.
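A minimal sketch of moving validation out of the schema and into code follows. The `get_weather` tool, its `city` and `units` arguments, the 100-character limit, and the helper names `validate_get_weather` and `handle_tool_call` are hypothetical, chosen only to illustrate the pattern. The idea is that rules once written into long descriptions are checked after the model returns its arguments, and any error is sent back as the tool result so the model can retry.

```python
import json
from typing import Any


def validate_get_weather(args: dict[str, Any]) -> list[str]:
    """Check arguments for a hypothetical get_weather tool.

    These rules replace constraints that would otherwise be described
    in prose inside the JSON schema sent to the model.
    """
    errors: list[str] = []

    city = args.get("city")
    if not isinstance(city, str) or not city.strip():
        errors.append("city must be a non-empty string")
    elif len(city) > 100:
        errors.append("city must be at most 100 characters")

    # The default lives here, not in the schema.
    units = args.get("units", "metric")
    if units not in ("metric", "imperial"):
        errors.append("units must be 'metric' or 'imperial'")

    return errors


def handle_tool_call(raw_arguments: str) -> dict[str, Any]:
    """Parse and validate the JSON argument string returned by the model."""
    try:
        args = json.loads(raw_arguments)
    except json.JSONDecodeError:
        return {"ok": False, "errors": ["arguments were not valid JSON"]}
    if not isinstance(args, dict):
        return {"ok": False, "errors": ["arguments must be a JSON object"]}

    errors = validate_get_weather(args)
    if errors:
        # Returning the errors as the tool result lets the model correct itself.
        return {"ok": False, "errors": errors}
    return {"ok": True, "args": args}


good = handle_tool_call('{"city": "Paris"}')
bad = handle_tool_call('{"city": "", "units": "kelvin"}')
```

The trade-off is an occasional extra round trip when the model sends an invalid argument, paid only on failures, in exchange for fewer definition tokens on every request.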
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:38:28.363969+00:00— report_created — created