Report #61390
[synthesis] Model fails to call multiple independent tools in parallel, causing sequential delays
Provide the independent tools in a single prompt. GPT-4o natively supports parallel tool calling (it returns an array of tool calls). Claude 3.5 Sonnet also supports parallel tool calling but sometimes sequences the calls if the system prompt implies an order. Gemini 1.5 Pro often struggles to call more than 2-3 tools in parallel and sequences them instead. Instruct the model explicitly: 'Call all independent tools simultaneously in a single response block.'
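A parallel response only reduces latency if the client also runs the returned calls concurrently. The sketch below shows one way to do that with `asyncio.gather`. The tool functions (`get_weather`, `get_time`), the `TOOLS` registry, and the `{"name": ..., "arguments": ...}` shape are hypothetical stand-ins for illustration, not any vendor's actual response format.

```python
import asyncio

# Hypothetical tools; the sleeps stand in for real network latency.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0.05)
    return f"weather:{city}"

async def get_time(city: str) -> str:
    await asyncio.sleep(0.05)
    return f"time:{city}"

# Hypothetical registry mapping tool names to implementations.
TOOLS = {"get_weather": get_weather, "get_time": get_time}

async def run_tool_calls(tool_calls: list[dict]) -> list[str]:
    """Run every tool call from one model response concurrently.

    Assumes each call is a dict with "name" and "arguments" keys.
    asyncio.gather returns results in the same order as the input calls.
    """
    tasks = [TOOLS[call["name"]](**call["arguments"]) for call in tool_calls]
    return await asyncio.gather(*tasks)

# Assumed shape of a parallel response: an array of independent tool calls.
calls = [
    {"name": "get_weather", "arguments": {"city": "Oslo"}},
    {"name": "get_time", "arguments": {"city": "Oslo"}},
]
results = asyncio.run(run_tool_calls(calls))
```

If the model sequences the calls instead, each response carries a single call and this concurrency gives no benefit, which is why the explicit instruction matters.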
Journey Context:
Parallel tool calling is a major latency lever. GPT-4o is the most aggressive at parallelizing and sometimes even parallelizes dependent tools (which causes errors). Claude is conservative and parallelizes only if the tools are obviously independent. Gemini often defaults to sequential calls. Relying on each model's native parallelization logic therefore leads to inconsistent latency. The fix is to define the dependency graph explicitly in the prompt ('Tool A and B are independent; C depends on B'), which pushes Claude and Gemini to parallelize A and B while keeping GPT-4o from parallelizing B and C.
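One way to keep the dependency statement consistent across prompts is to generate it from a single graph definition. The following is a minimal sketch under stated assumptions: the tool names (`tool_a`, `tool_b`, `tool_c`) and the exact wording of the generated prompt are illustrative, and only the final instruction sentence comes from this report. It uses `graphlib.TopologicalSorter` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group tools into batches whose members can be called in parallel.

    `deps` maps each tool to the set of tools it depends on.
    """
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tools with all dependencies met
        batches.append(ready)
        sorter.done(*ready)
    return batches

def dependency_prompt(deps: dict[str, set[str]]) -> str:
    """Render the graph as explicit prompt text (wording is an assumption)."""
    lines = []
    for tool, needs in sorted(deps.items()):
        if needs:
            lines.append(f"{tool} depends on {', '.join(sorted(needs))}.")
        else:
            lines.append(f"{tool} is independent.")
    lines.append(
        "Call all independent tools simultaneously in a single response block."
    )
    return "\n".join(lines)

# Mirrors the example in the text: A and B are independent; C depends on B.
deps = {"tool_a": set(), "tool_b": set(), "tool_c": {"tool_b"}}
batches = parallel_batches(deps)
prompt = dependency_prompt(deps)
```

Here `batches` groups `tool_a` and `tool_b` together and places `tool_c` in a later batch, so the same graph can also be used on the client side to check whether a model's response parallelized a dependent call.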
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:31:47.694645+00:00— report_created — created