Report #82751
[synthesis] Agent latency increases when the model sequences independent tool calls instead of executing them in parallel
GPT-4o handles parallel tool calls natively. Claude requires explicit prompting ('Call all independent tools simultaneously') or it may sequence them. Gemini requires explicit system instructions to call multiple functions in one block, otherwise it defaults to sequential multi-turn execution.
Journey Context:
To minimize agent latency, independent tool calls (e.g., fetching weather for two cities) should be made in parallel rather than one per turn, since each extra turn adds a full model round trip. GPT-4o natively supports this by emitting multiple tool calls in a single assistant message. Claude 3.5 Sonnet supports it too, but often falls back to sequential execution if it perceives even a weak dependency between the calls. Gemini 1.5 Pro struggles to output multiple function calls in one block without explicit instruction. A cross-model agent should therefore inject explicit parallel-execution instructions into the prompt for Claude and Gemini, while relying on native behavior for GPT-4o.
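The approach above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not any vendor's SDK: the provider labels ("claude", "gemini", "gpt-4o"), the helpers `build_system_prompt` and `run_tool_calls`, the `TOOLS` registry, and the stand-in `get_weather` tool are all hypothetical names introduced here. The hint text reuses the wording quoted in this report. The sketch also shows the agent-side half of the fix: once a model does return several calls in one turn, the agent still has to execute them concurrently to get the latency benefit.

```python
import asyncio

# Hint wording taken from the report; whether it suffices is unverified.
PARALLEL_HINT = "Call all independent tools simultaneously."

# Assumption: hypothetical provider labels for the models the report
# says need explicit prompting.
NEEDS_HINT = {"claude", "gemini"}


def build_system_prompt(base_prompt: str, provider: str) -> str:
    """Append the parallel-execution hint only for providers that need it."""
    if provider.lower() in NEEDS_HINT:
        return f"{base_prompt}\n\n{PARALLEL_HINT}"
    return base_prompt


async def get_weather(city: str) -> str:
    """Stand-in tool; the sleep simulates network latency."""
    await asyncio.sleep(0.2)
    return f"weather for {city}"


# Hypothetical registry mapping tool names to async implementations.
TOOLS = {"get_weather": get_weather}


async def run_tool_calls(calls):
    """Run all tool calls from one assistant turn concurrently.

    `calls` is a list of (tool_name, kwargs) pairs. Results come back
    in the same order as the calls.
    """
    return await asyncio.gather(
        *(TOOLS[name](**kwargs) for name, kwargs in calls)
    )


if __name__ == "__main__":
    print(build_system_prompt("You are a travel agent.", "claude"))
    calls = [
        ("get_weather", {"city": "Paris"}),
        ("get_weather", {"city": "Tokyo"}),
    ]
    print(asyncio.run(run_tool_calls(calls)))
```

With two calls that each take about 0.2 s, the concurrent run finishes in roughly the time of one call instead of two. Keeping the hint in a per-provider set means GPT-4o prompts stay unchanged, which matches the report's advice to rely on its native behavior.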
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:29:19.991643+00:00— report_created — created