Report #84474
[synthesis] Agent fails to execute independent tool calls in parallel, resulting in sequential latency
For GPT-4o, rely on implicit parallel tool calling (it natively returns arrays of tool calls). For Claude, explicitly instruct 'make all independent tool calls in the same block' and ensure your API client supports parallel execution. For Gemini, force explicit parallelization via system prompt instructions.
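On the client side, a batch of independent tool calls returned in a single model turn can be dispatched concurrently instead of one at a time. The following is a minimal sketch, assuming the tool calls arrive as a list of dicts with hypothetical `id`, `name`, and `arguments` keys. The `TOOLS` registry and both tool functions are illustrative stand-ins, not any vendor's SDK.

```python
import asyncio
import time

# Hypothetical tools; asyncio.sleep stands in for network or I/O latency.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0.2)
    return f"weather for {city}"

async def get_time(city: str) -> str:
    await asyncio.sleep(0.2)
    return f"time in {city}"

# Hypothetical registry mapping tool names to coroutine functions.
TOOLS = {"get_weather": get_weather, "get_time": get_time}

async def run_tool_calls(tool_calls: list[dict]) -> dict[str, str]:
    """Run every tool call from one model turn concurrently, keyed by call id."""
    coros = [TOOLS[call["name"]](**call["arguments"]) for call in tool_calls]
    # gather() schedules all coroutines at once and returns results in input order.
    results = await asyncio.gather(*coros)
    return {call["id"]: result for call, result in zip(tool_calls, results)}

def demo() -> tuple[dict[str, str], float]:
    """Execute two independent calls and report the wall-clock time taken."""
    calls = [
        {"id": "call_1", "name": "get_weather", "arguments": {"city": "Paris"}},
        {"id": "call_2", "name": "get_time", "arguments": {"city": "Paris"}},
    ]
    start = time.perf_counter()
    results = asyncio.run(run_tool_calls(calls))
    return results, time.perf_counter() - start
```

Keying results by call id lets each one be sent back to the model as the matching tool result. A loop that awaits each call in turn would instead take roughly the sum of the individual latencies.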
Journey Context:
Developers often build agentic loops that assume one tool call per turn. GPT-4o natively supports parallel tool calling and returns an array of tool calls when the calls are independent. Claude 3.5 historically preferred sequential calls; it has since added parallel support, but triggering it reliably requires explicit prompting or specific agentic scaffolding. Gemini often needs explicit system prompt instructions as well. Assuming a model will parallelize automatically can lead to 3-5x latency penalties in multi-step workflows.
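The latency penalty follows from simple arithmetic: sequential execution costs roughly the sum of the tool latencies, while concurrent execution costs roughly the slowest single call. A rough sketch of that model follows. The function names are hypothetical, any latencies passed in are illustrative assumptions rather than measurements, and model round-trip time between turns is ignored.

```python
def sequential_latency(latencies: list[float]) -> float:
    """Total time when each tool call waits for the previous one to finish."""
    return sum(latencies)

def parallel_latency(latencies: list[float]) -> float:
    """Total time when all independent tool calls start together."""
    return max(latencies)

def speedup(latencies: list[float]) -> float:
    """Ratio of sequential to parallel wall-clock time."""
    return sequential_latency(latencies) / parallel_latency(latencies)
```

For example, four independent calls with hypothetical latencies of 0.5, 0.5, 1.0, and 1.0 seconds take 3.0 seconds in sequence and 1.0 second in parallel, a 3x difference.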
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:22:47.163487+00:00— report_created — created