Report #49746
[synthesis] Model executes independent tool calls sequentially, increasing latency
Explicitly instruct the model in the system prompt: 'If multiple tool calls are independent, invoke them in the same function_call block.' GPT-4o supports this natively via parallel tool calls. Claude requires explicit instruction to return multiple tool_use blocks in a single response. Gemini 1.5 Pro often struggles with parallel calls and may need to be forced to call tools sequentially.
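The client also has to run the returned calls concurrently, or the prompt change buys nothing. Below is a minimal sketch of that side, assuming the model's turn has already been parsed into a list of dicts with `id`, `name`, and `input` keys. That shape, the tool names, and the `PARALLEL_TOOLS_INSTRUCTION` constant are all hypothetical and not taken from any vendor SDK; the sleeps stand in for real network calls.

```python
import asyncio
import time

# Hypothetical system-prompt addition; wording taken from the report.
PARALLEL_TOOLS_INSTRUCTION = (
    "If multiple tool calls are independent, invoke them in the same "
    "function_call block."
)


# Hypothetical tools. The sleeps simulate I/O latency.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0.2)
    return f"weather for {city}"


async def get_stock_price(ticker: str) -> str:
    await asyncio.sleep(0.2)
    return f"price for {ticker}"


TOOLS = {"get_weather": get_weather, "get_stock_price": get_stock_price}


async def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run all tool calls from one model turn concurrently, keeping their order."""
    coros = [TOOLS[call["name"]](**call["input"]) for call in tool_calls]
    outputs = await asyncio.gather(*coros)
    return [
        {"id": call["id"], "output": output}
        for call, output in zip(tool_calls, outputs)
    ]


# Simplified stand-in for one model turn containing two independent calls.
example_calls = [
    {"id": "call_1", "name": "get_weather", "input": {"city": "Paris"}},
    {"id": "call_2", "name": "get_stock_price", "input": {"ticker": "ACME"}},
]

start = time.perf_counter()
results = asyncio.run(run_tool_calls(example_calls))
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

Because `asyncio.gather` returns results in the order the awaitables were passed, each output can be matched back to its call `id` without extra bookkeeping.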
Journey Context:
Developers expect models to optimize latency by calling independent tools in parallel. GPT-4o does this well by default. Claude 3.5 Sonnet, however, has a strong bias toward sequential execution (calling one tool, waiting for the result, then calling the next) even when the tools have no dependencies on each other. Each sequential call adds a full model round trip, so multi-step agent latency grows with the number of tools. You must explicitly instruct Claude in the system prompt to parallelize.
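The latency cost can be estimated with a rough back-of-the-envelope model. This sketch assumes a fixed time per model turn and one final turn to produce the answer; the function names and the numbers in the example are hypothetical, not measurements from the report.

```python
def sequential_latency(model_turn_s: float, tool_latencies_s: list[float]) -> float:
    """One model turn per tool call, plus a final turn to write the answer."""
    turns = len(tool_latencies_s) + 1
    return turns * model_turn_s + sum(tool_latencies_s)


def parallel_latency(model_turn_s: float, tool_latencies_s: list[float]) -> float:
    """One turn issues every call, the tools run concurrently, then a final turn."""
    if not tool_latencies_s:
        return model_turn_s
    return 2 * model_turn_s + max(tool_latencies_s)


# Illustrative numbers only: 2 s per model turn, three tools at 0.5 s each.
tools = [0.5, 0.5, 0.5]
seq = sequential_latency(2.0, tools)
par = parallel_latency(2.0, tools)
print(f"sequential: {seq} s, parallel: {par} s")
```

Under these assumptions the model turns dominate, so removing them matters more than overlapping the tool calls themselves.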
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:58:39.194328+00:00— report_created — created