Report #58233
[cost\_intel] Token bloat in native tool calling for high-frequency agent loops
For agent loops with more than 10 tool calls, avoid the native \`tools\` parameter; instead, inject the tool schemas into the system prompt manually and parse the model's XML/JSON output. This removes roughly 200 tokens of hidden 'schema description' overhead per call, for an estimated saving of 20-30% on high-volume pipelines.
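A minimal sketch of the manual approach, assuming a JSON reply format. The tool registry, the `{"tool": ..., "args": ...}` reply shape, and the function names `build_system_prompt` and `parse_tool_call` are all hypothetical and chosen for illustration; they are not part of any provider's API.

```python
import json

# Hypothetical tool registry: tool name -> {parameter name: type label}.
TOOLS = {
    "search": {"query": "string"},
    "read_file": {"path": "string"},
}


def build_system_prompt(tools):
    """Render a compact, one-line-per-tool description for the system prompt."""
    lines = [
        "You can call tools. Reply with a single JSON object:",
        '{"tool": "<name>", "args": {...}}',
        "Available tools:",
    ]
    for name, params in tools.items():
        signature = ", ".join(f"{p}: {t}" for p, t in params.items())
        lines.append(f"- {name}({signature})")
    return "\n".join(lines)


def parse_tool_call(text, tools):
    """Extract and validate one tool call from raw model output.

    Returns (name, args); raises ValueError on anything unexpected.
    """
    # Tolerate prose around the JSON by slicing from the first '{' to the last '}'.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")
    try:
        call = json.loads(text[start:end + 1])
    except json.JSONDecodeError as exc:
        raise ValueError(f"malformed JSON: {exc}") from exc

    name = call.get("tool")
    args = call.get("args", {})
    if not isinstance(name, str) or name not in tools:
        raise ValueError(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ValueError("'args' must be a JSON object")
    missing = set(tools[name]) - set(args)
    if missing:
        raise ValueError(f"missing arguments: {sorted(missing)}")
    return name, args
```

Because nothing validates the output on the provider side in this setup, the parser rejects unknown tools and missing arguments itself; a production loop would typically feed the `ValueError` message back to the model and retry.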
Journey Context:
Native tool calling injects the full function JSON schema into the prompt on every turn \(and adds special tokens\). In a 50-step ReAct loop with 5 tools, at roughly 200 tokens per call, this adds ~10k tokens of overhead. Native tools offer automatic schema validation, but the cost is prohibitive for high-frequency loops. Manual formatting removes this overhead but requires robust parsing. Alternative: batch multiple tool calls into a single generation using a 'tool\_calls' array in the response schema.
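The overhead arithmetic and the batching alternative can be sketched as follows. The ~200 tokens per call and the 'tool\_calls' key come from this report; the `name`/`args` fields inside each entry, the function names, and the assumption that the overhead is paid once per generation are illustrative guesses, not measured behavior.

```python
import json
import math


def schema_overhead(tool_calls, tokens_per_generation, calls_per_generation=1):
    """Estimate total schema overhead, assuming it is paid once per generation."""
    generations = math.ceil(tool_calls / calls_per_generation)
    return generations * tokens_per_generation


def parse_batched_calls(text):
    """Parse a response of the form {"tool_calls": [{"name": ..., "args": {...}}]}.

    Returns a list of (name, args) tuples; raises ValueError on bad input.
    json.JSONDecodeError is a subclass of ValueError, so malformed JSON
    surfaces as the same exception type.
    """
    payload = json.loads(text)
    if not isinstance(payload, dict):
        raise ValueError("top-level response must be a JSON object")
    calls = payload.get("tool_calls")
    if not isinstance(calls, list) or not calls:
        raise ValueError("'tool_calls' must be a non-empty array")

    parsed = []
    for index, call in enumerate(calls):
        if not isinstance(call, dict) or not isinstance(call.get("name"), str):
            raise ValueError(f"tool_calls[{index}] needs a string 'name'")
        args = call.get("args", {})
        if not isinstance(args, dict):
            raise ValueError(f"tool_calls[{index}].args must be an object")
        parsed.append((call["name"], args))
    return parsed
```

With the report's figures, `schema_overhead(50, 200)` gives 10000 tokens. If the same 50 calls could be batched five per generation, `schema_overhead(50, 200, calls_per_generation=5)` gives 2000 under the stated assumption. Batching only applies when the calls in one array do not depend on each other's results.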
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:14:05.255373+00:00— report_created — created