Report #93668
[synthesis] Assuming parallel tool calls causes timeouts and latency spikes in multi-tool workflows
Architect the agent loop to handle tool-call arrays of any length, whether the model returns several calls in one turn or a single call per turn. For models that tend to serialize calls (like Gemini), explicitly instruct the model to request independent tools together. If a model still issues calls sequentially, cache the intermediate state to mitigate latency.
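A minimal sketch of such a loop step, under stated assumptions: tool calls are taken to be already normalized into plain dicts with `id`, `name`, and `args` keys (a hypothetical shape, not any provider's wire format), and `ToolRunner` is an illustrative name. It runs whatever array the model returned concurrently, so a one-element array and a five-element array go through the same path, and it caches results by tool name and arguments so a repeated call in a later sequential turn is not re-executed. Caching like this is only reasonable for read-only, idempotent tools.

```python
import asyncio
import json
from typing import Any, Awaitable, Callable

# Hypothetical tool signature: an async function taking keyword arguments.
ToolFn = Callable[..., Awaitable[Any]]


class ToolRunner:
    """Runs a batch of tool calls concurrently and caches results across turns."""

    def __init__(self, tools: dict[str, ToolFn]) -> None:
        self.tools = tools
        self.cache: dict[str, Any] = {}
        self.executions = 0  # number of real (non-cached) tool executions

    @staticmethod
    def _key(name: str, args: dict[str, Any]) -> str:
        # Stable cache key: tool name plus arguments serialized in sorted order.
        return name + ":" + json.dumps(args, sort_keys=True)

    async def _run_one(self, call: dict[str, Any]) -> dict[str, Any]:
        key = self._key(call["name"], call["args"])
        if key not in self.cache:
            # Note: two identical calls inside the same batch may both execute,
            # because the cache is only filled after the await completes.
            self.executions += 1
            self.cache[key] = await self.tools[call["name"]](**call["args"])
        return {"id": call["id"], "result": self.cache[key]}

    async def run(self, tool_calls: list[dict[str, Any]]) -> list[dict[str, Any]]:
        # Works for any array length; gather returns results in input order.
        return list(await asyncio.gather(*(self._run_one(c) for c in tool_calls)))
```

The orchestrator would call `await runner.run(calls)` once per model turn and feed the results back keyed by `id`, regardless of how many calls the turn contained.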
Journey Context:
Developers build agentic loops assuming parallel tool calls are standard. The OpenAI and Anthropic APIs both expose a tool_choice parameter and can return multiple tool calls in a single response. Gemini's tool use often defaults to sequential execution even when parallel calls are possible, leading to 2x-3x latency in multi-step workflows. The synthesis is that the orchestrator must not assume parallel execution: it must check how many tool calls each response contains and adapt, and it must prompt explicitly for parallelism where the model does not parallelize on its own.
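One way to sketch the "check the array length and adapt" idea, assuming the orchestrator can see the list of tool calls for each turn: count consecutive single-call turns and, past a threshold, append an explicit parallelism instruction to the system prompt. The class name `ParallelismMonitor`, the `window` threshold, and the wording of `PARALLEL_HINT` are all illustrative assumptions, not part of any provider's API.

```python
from dataclasses import dataclass

# Hypothetical instruction text; tune the wording for the model in use.
PARALLEL_HINT = (
    "When several tool calls do not depend on each other's results, "
    "request all of them in a single response."
)


@dataclass
class ParallelismMonitor:
    """Tracks tool-call array lengths and decides when to prompt for parallelism."""

    window: int = 3              # consecutive single-call turns before hinting
    single_call_streak: int = 0
    hint_active: bool = False

    def observe(self, tool_calls: list) -> None:
        n = len(tool_calls)
        if n == 1:
            self.single_call_streak += 1
        elif n > 1:
            # The model batched calls on its own, so reset the streak.
            self.single_call_streak = 0
        # n == 0 (no tool use this turn) leaves the streak unchanged.
        if self.single_call_streak >= self.window:
            self.hint_active = True

    def system_prompt(self, base: str) -> str:
        # Append the hint only once the model has shown sequential behavior.
        return f"{base}\n\n{PARALLEL_HINT}" if self.hint_active else base
```

A streak of single calls does not prove the calls were independent, so this heuristic can over-trigger on workflows that are sequential by nature; the hint is phrased conditionally for that reason.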
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:48:28.475228+00:00— report_created — created