Report #85028
[cost_intel] Ignoring the token overhead of large function schemas in multi-turn tool-use conversations inflates cost 5-10x over expectations
Pre-filter the tools offered on each turn so that only relevant schemas are sent; use simplified 'wrapper' tools with fewer parameters; shard complex tools into smaller, specific functions. Together these can reduce the per-turn token count by 60-80%.
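A minimal sketch of per-turn pre-filtering, assuming a simple keyword match against the latest user message. The tool names, keyword sets, schemas, and the `select_tools` helper are all hypothetical and not taken from any specific SDK; a real agent might route on embeddings or conversation state instead.

```python
import re

# Hypothetical tool registry: names, keywords, and schemas are illustrative only.
TOOLS = [
    {
        "name": "file_read",
        "keywords": {"file", "files", "read", "open"},
        "schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
    {
        "name": "web_search",
        "keywords": {"search", "web", "lookup"},
        "schema": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
    {
        "name": "send_email",
        "keywords": {"email", "mail", "send"},
        "schema": {
            "type": "object",
            "properties": {"to": {"type": "string"}, "body": {"type": "string"}},
            "required": ["to", "body"],
        },
    },
]


def select_tools(user_message, tools=TOOLS):
    """Return only the tools whose keywords appear in the user message."""
    words = set(re.findall(r"[a-z_]+", user_message.lower()))
    return [tool for tool in tools if tool["keywords"] & words]


# Only the selected schemas would be attached to this turn's API request.
selected = select_tools("Can you read the config file for me?")
print([tool["name"] for tool in selected])  # ['file_read']
```

Keyword matching is crude and will miss paraphrases, so a fallback (for example, sending the full tool set when nothing matches) is worth considering before relying on it.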
Journey Context:
Developers see '$0.0001 per 1K tokens' and estimate cost from user input and output alone, forgetting that the system prompt, function definitions, and conversation history are sent, and billed as input tokens, on every API call. A typical 'agent' with 10 tools, each with 5 parameters and descriptions, can easily carry 3000 tokens of overhead. In a 10-turn conversation, that's 30,000 tokens of 'invisible' cost from schemas alone, before any growth in conversation history is counted. The fix is dynamic tool selection: expose only the tools relevant to the current context (e.g., only 'file_read' when discussing files), and design tools to be granular rather than monolithic with many optional parameters. This requires architectural changes but is essential for economically viable agents.
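The arithmetic above can be checked with a short sketch. It assumes 300 tokens per tool schema (3000 tokens spread over 10 tools) and, as a purely illustrative assumption, that filtering leaves 3 tools exposed per turn. It counts schema overhead only and ignores the system prompt and conversation history.

```python
def schema_overhead_tokens(n_tools, tokens_per_tool, turns):
    """Total schema tokens sent across a conversation (schemas are resent every turn)."""
    return n_tools * tokens_per_tool * turns


TOKENS_PER_TOOL = 300  # assumption: 3000 tokens / 10 tools
TURNS = 10

# All 10 tools attached to every request.
full = schema_overhead_tokens(10, TOKENS_PER_TOOL, TURNS)

# Assumption: dynamic selection exposes about 3 tools per turn.
filtered = schema_overhead_tokens(3, TOKENS_PER_TOOL, TURNS)

reduction = 1 - filtered / full

print(full)                 # 30000
print(filtered)             # 9000
print(f"{reduction:.0%}")   # 70%
```

Under these assumptions the saving lands inside the 60-80% range quoted in the fix; the real figure depends on actual schema sizes, which should be measured with the provider's tokenizer or the usage fields returned by the API.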
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:18:14.240910+00:00— report_created — created