Report #82313
[synthesis] Agent assumes parallel tool execution but model outputs sequential or single tool calls
For Claude and Gemini, add an explicit instruction to the prompt, such as 'Call all independent tools simultaneously in a single response block', and structure the output parser to accept either an array of tool calls or a single combined tool call. For GPT-4o, rely on the native parallel_tool_calls parameter.
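A minimal sketch of the parser side of this fix, assuming the assistant message has already been decoded into a plain dict. The `ToolCall` type and `normalize_tool_calls` helper are hypothetical names used for illustration. The two shapes handled are an OpenAI-style `tool_calls` list (arguments as a JSON string) and an Anthropic-style `content` list containing `tool_use` blocks; other providers would need their own branch.

```python
import json
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ToolCall:
    """Provider-neutral view of one tool call (hypothetical type)."""
    id: str
    name: str
    arguments: dict[str, Any]


def normalize_tool_calls(message: dict[str, Any]) -> list[ToolCall]:
    """Return every tool call in one assistant message as a flat list.

    The result is always a list, so downstream code never has to care
    whether the model emitted zero, one, or several calls in the turn.
    """
    calls: list[ToolCall] = []

    # OpenAI-style shape: "tool_calls" holds a list; arguments are a JSON string.
    raw = message.get("tool_calls") or []
    if isinstance(raw, dict):  # tolerate a single call not wrapped in a list
        raw = [raw]
    for item in raw:
        fn = item["function"]
        calls.append(ToolCall(item["id"], fn["name"], json.loads(fn["arguments"])))

    # Anthropic-style shape: "content" is a list of blocks, some of type "tool_use".
    content = message.get("content")
    if isinstance(content, list):
        for block in content:
            if isinstance(block, dict) and block.get("type") == "tool_use":
                calls.append(ToolCall(block["id"], block["name"], block["input"]))

    return calls
```

Returning a list in every case means a model that sequences its calls simply yields one-element lists across several turns, and the agent loop needs no special case for it.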
Journey Context:
Agent frameworks try to reduce latency by running independent tool calls in parallel. GPT-4o's API supports this natively by returning an array of tool calls. Claude's API also accepts multiple tool blocks in one response, but unless heavily prompted the model often prefers to sequence its calls to maintain logical flow. Gemini often defaults to sequential calls as well. The synthesis: parallel tool calling is an API feature in GPT-4o but a prompting challenge in Claude and Gemini. A framework that assumes native parallelism across all models will hit sequential bottlenecks on the non-OpenAI ones.
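On the execution side, a framework can avoid hard-coding the assumption that several calls always arrive together. The sketch below is one hedged way to do that with the standard library: it runs whatever calls a turn produced concurrently and returns results in input order, so a turn with a single call still works. `run_tool_calls` and the `registry` mapping are hypothetical names, and the sketch assumes the calls in one turn are independent of each other.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable


def run_tool_calls(
    calls: list[tuple[str, dict[str, Any]]],
    registry: dict[str, Callable[..., Any]],
    max_workers: int = 8,
) -> list[Any]:
    """Run the tool calls from one model turn concurrently.

    `calls` is a list of (tool_name, keyword_arguments) pairs and
    `registry` maps tool names to Python callables. Results come back
    in the same order as `calls`, whether the turn held one or many.
    """
    if not calls:
        return []
    workers = min(max_workers, len(calls))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Submit everything first so the calls overlap, then collect in order.
        futures = [pool.submit(registry[name], **args) for name, args in calls]
        return [future.result() for future in futures]
```

Logging `len(calls)` per turn is a cheap way to check whether a given model is actually batching its calls or emitting them one at a time.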
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:45:17.892959+00:00— report_created — created