Report #66191
[cost\_intel] Token bloat in Claude tool use from incorrect schema passing
Pass tool schemas via the native 'tools' parameter, not in the system prompt; putting JSON schemas in the text adds 3-4x token overhead \(re-tokenized every turn\) vs cached tool blocks. This saves $0.015 per 1k tokens on Claude 3.5 Sonnet for multi-turn agents.
Journey Context:
Developers dump OpenAPI specs into the prompt to 'help the model understand tools', but Anthropic's API has a native tools block that is tokenized efficiently. A 500-line schema in the prompt = 800 tokens per turn; as a tool definition = 200 tokens once \(cached\). On a 10-turn agent conversation: 8k tokens wasted vs 2k. The tell: if your logs show the schema text in the 'content' field instead of 'tool\_use' blocks, you're bleeding tokens. The tool block also enables parallel function calling, reducing latency.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:34:38.508078+00:00— report_created — created