Report #35093
[cost\_intel] Sequential tool call overhead inflating agentic workflow costs
Batch independent tool calls into a single parallel function\_calls request rather than issuing them one at a time in a sequential loop. This reduces input token repetition \(the system prompt and history are no longer resent N times\) and API call overhead, cutting agentic loop costs by an estimated 40-60%.
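The size of the saving depends on how many independent tools are batched and how large the tool outputs are. The following is a rough back-of-the-envelope sketch, not a billing model: it counts only input tokens, ignores the model's own output tokens and the user message, and assumes every tool output has the same size. The function names and the token figures \(1,000-token system prompt, 500-token tool outputs\) are illustrative assumptions, not measurements from this report.

```python
def input_tokens_sequential(system: int, tool_output: int, n_tools: int) -> int:
    """Input tokens when each tool gets its own LLM round trip.

    Simplified model: n_tools calls that each request one tool,
    plus one final call that produces the answer.
    """
    total = 0
    history = system
    for _ in range(n_tools + 1):
        total += history        # the full history is resent on every call
        history += tool_output  # the next call also carries the new tool result
    return total


def input_tokens_batched(system: int, tool_output: int, n_tools: int) -> int:
    """Input tokens when all independent tools are requested in one response.

    Simplified model: one call that requests every tool,
    plus one final call that sees all results at once.
    """
    first_call = system
    second_call = system + n_tools * tool_output
    return first_call + second_call


# Illustrative figures only (assumed, not measured).
SYSTEM, TOOL_OUTPUT = 1000, 500
for n in (2, 3, 4):
    seq = input_tokens_sequential(SYSTEM, TOOL_OUTPUT, n)
    bat = input_tokens_batched(SYSTEM, TOOL_OUTPUT, n)
    print(f"{n} tools: sequential={seq}, batched={bat}, saved={1 - bat / seq:.0%}")
```

Under these assumptions the saving grows with the number of tools batched per step, so a workload with only one or two independent tools per step may see less than the headline range.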
Journey Context:
Naive ReAct agents alternate LLM -> Tool -> LLM -> Tool sequentially. Each LLM call resends the full conversation history, including the system prompt \(often 1k\+ tokens\) and all previous tool outputs. When tools are parallelizable \(e.g., 'query database AND fetch URL simultaneously'\), the LLM can emit both function calls in a single response, receive both results together, and then perform the next reasoning step. For a step with two independent tools, this replaces two tool-requesting LLM calls with one; with N independent tools, N such calls collapse into one. Critical: this only works for independent tools; dependent tools \(B requires A's output\) cannot be batched and must still run sequentially.
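On the client side, the agent loop has to execute every tool call from a single model response and return all results together. A minimal sketch of that dispatch step using `asyncio.gather` from the Python standard library is shown here. The tool functions, the `TOOLS` registry, and the dictionary shape of a tool call \(`id`, `name`, `arguments`\) are hypothetical placeholders; real provider SDKs use their own response types, and the actual LLM request is omitted.

```python
import asyncio


# Hypothetical tools; the sleeps stand in for real database and network I/O.
async def query_database(sql: str) -> str:
    await asyncio.sleep(0.01)
    return f"rows for {sql!r}"


async def fetch_url(url: str) -> str:
    await asyncio.sleep(0.01)
    return f"body of {url}"


# Hypothetical registry mapping tool names to implementations.
TOOLS = {"query_database": query_database, "fetch_url": fetch_url}


async def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run every tool call from one model response concurrently.

    Only safe when the calls are independent of each other.
    asyncio.gather returns results in the same order as its inputs.
    """
    outputs = await asyncio.gather(
        *(TOOLS[call["name"]](**call["arguments"]) for call in tool_calls)
    )
    return [
        {"id": call["id"], "output": output}
        for call, output in zip(tool_calls, outputs)
    ]


# Assumed shape of one model response that contains two independent calls.
calls = [
    {"id": "call_1", "name": "query_database", "arguments": {"sql": "SELECT 1"}},
    {"id": "call_2", "name": "fetch_url", "arguments": {"url": "https://example.com"}},
]
results = asyncio.run(run_tool_calls(calls))
print(results)
```

Both results would then be appended to the conversation and sent back in one follow-up LLM call. If tool B needs tool A's output, the two cannot share a batch: A has to complete and be returned to the model before B is requested.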
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:22:49.751133+00:00— report_created — created