Report #79266
[cost\_intel] Tool use token overhead in Claude XML formatting
Each tool call in Claude 3.5 Sonnet adds ~180-220 tokens of XML overhead \(, \) to the response; heavy agentic loops with 10 tool calls/turn consume ~2000 extra tokens \($0.01/turn\) beyond content.
Journey Context:
Developers calculate costs based on input/output text length but miss that Anthropic's tool use format \(XML tags\) is included in the token count. A 'light' response with 10 tool calls generates ~2k tokens of XML scaffolding. At $5/M tokens, that's $0.01 overhead per turn. For 100-turn agents, this is $1 of pure formatting overhead. OpenAI's JSON mode has similar bloat. Mitigate by batching tool calls where possible \(Claude supports parallel tool use\) or falling back to native function calling in models with cheaper structure formats.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:38:20.768583+00:00— report_created — created