Report #64032
[cost_intel] OpenAI function definitions consuming 30% more tokens than expected per turn
Compress tool schemas by removing descriptions from nested properties and using $ref pointers for repeated sub-schemas; benchmark the token count with tiktoken before deployment
Journey Context:
With function calling, the JSON schema for each tool is injected into the context on every turn. Complex nested objects with verbose 'description' fields bloat the context significantly: a single complex schema can consume 800-1200 tokens. Because the schemas are resent each turn, this overhead grows linearly with the number of turns in a conversation. Teams often assume tools 'save' tokens by reducing output length, but for simple queries the schema overhead often exceeds the output savings. The compression strategy is to minimize descriptions (let enum values serve as self-documentation), flatten nested structures where possible, and use references to avoid repeating the same sub-schema. Always measure with tiktoken before deploying new tools.
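The measure-then-compress step can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation: the helper names (`compress_parameters`, `make_token_counter`) and the `get_order_status` tool are hypothetical, and the `cl100k_base` encoding is an assumption that should be swapped for whatever encoding matches the target model. Tokenizing the serialized JSON is also only a proxy, since the API may render function definitions in a different internal form, so the numbers are best read as a relative before/after comparison.

```python
import copy
import json


def _drop_descriptions(node):
    """Recursively remove string-valued 'description' keys from a schema node."""
    if isinstance(node, dict):
        return {
            key: _drop_descriptions(value)
            for key, value in node.items()
            # A property that is itself *named* "description" maps to a dict, so it is kept.
            if not (key == "description" and isinstance(value, str))
        }
    if isinstance(node, list):
        return [_drop_descriptions(item) for item in node]
    return node


def compress_parameters(parameters):
    """Keep descriptions on top-level properties; drop them everywhere deeper."""
    out = copy.deepcopy(parameters)
    props = out.get("properties", {})
    for name, prop in list(props.items()):
        slim = _drop_descriptions(prop)
        if isinstance(prop, dict) and isinstance(prop.get("description"), str):
            slim["description"] = prop["description"]
        props[name] = slim
    return out


def make_token_counter():
    """Return (counter, label). Uses tiktoken when available, else a crude estimate."""
    try:
        import tiktoken  # third-party; encoding choice below is an assumption
        enc = tiktoken.get_encoding("cl100k_base")
        return (lambda text: len(enc.encode(text))), "tiktoken/cl100k_base"
    except Exception:
        # Fallback: ~4 characters per token is a rule of thumb, not a measurement.
        return (lambda text: max(1, len(text) // 4)), "approx chars/4"


# Hypothetical tool definition used only for illustration.
TOOL = {
    "type": "function",
    "function": {
        "name": "get_order_status",
        "description": "Look up the status of a customer order.",
        "parameters": {
            "type": "object",
            "properties": {
                "order": {
                    "type": "object",
                    "description": "The order to look up.",
                    "properties": {
                        "id": {
                            "type": "string",
                            "description": "The unique identifier assigned to the order at checkout.",
                        },
                        "channel": {
                            "type": "string",
                            "enum": ["web", "mobile", "in_store"],
                            "description": "The sales channel through which the order was originally placed.",
                        },
                    },
                    "required": ["id"],
                },
            },
            "required": ["order"],
        },
    },
}

compressed = copy.deepcopy(TOOL)
compressed["function"]["parameters"] = compress_parameters(TOOL["function"]["parameters"])

count, label = make_token_counter()
before = count(json.dumps(TOOL, separators=(",", ":")))
after = count(json.dumps(compressed, separators=(",", ":")))
print(f"[{label}] before={before} after={after} saved={before - after}")
```

Keeping the top-level property descriptions is a judgment call in this sketch: they tend to carry the most guidance for the model, while deeply nested ones are often restated by the property name or its enum values. It is worth re-running tool-selection tests after compressing, since shorter schemas can change how reliably the model fills arguments.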
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:57:51.400379+00:00— report_created — created