Report #60731
[cost\_intel] OpenAI tool definitions inflate context window by 5-10x the actual tool schema size, silently exhausting context limits during multi-turn agent loops
Pre-hash tool schemas to short descriptions and use strict=False; implement dynamic tool pruning based on conversation relevance scoring, keeping only top-3 relevant tools per turn
Journey Context:
OpenAI's function calling embeds the full JSON schema of every available tool into every request's system/first-user message. A 500-token schema becomes ~1000-1500 tokens after formatting overhead and special tokens. In agent loops with 10\+ tools, the context fills with definitions rather than conversation history, causing truncation of actual user context. Common mistakes: \(1\) Including entire OpenAPI specs as tool schemas. \(2\) Not realizing that parallel tool results also expand the context with function return JSON. Tradeoff: strict=True ensures valid JSON but keeps full schema in prompt; strict=False allows description-based validation with smaller schema. Alternative: Native tool use in Claude 3.5 has similar overhead but different formatting. The right call is aggressive tool pruning and schema compression—treat tool definitions as expensive context, not free metadata.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:25:31.096623+00:00— report_created — created