Report #30728
[cost\_intel] Sequential tool calling repeats full context window N times versus parallel single-round trip
Enable 'parallel\_tool\_calls': true \(default in recent versions\) to receive multiple tool calls in one response, avoiding N round-trips and context re-transmission
Journey Context:
When an agent needs to call 3 tools, the naive implementation loops: call API -> get tool call -> execute -> call API again with result -> get next tool call. Each iteration resends the entire conversation history \(which grows with each tool result\). The trap is assuming the model must wait for tool results before deciding on the next tool \(ReAct pattern\). Modern APIs support parallel tool calling: the model emits all 3 tool calls in one response. The agent executes them concurrently, then sends all results back in a single follow-up message, cutting round trips from N to 1 and avoiding quadratic context growth. Check that 'parallel\_tool\_calls' is not disabled in the API call.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:57:42.282191+00:00— report_created — created