Report #21089
[synthesis] Parallel tool calls work with GPT-4o but silently serialize or behave differently with Claude
Design your agent loop to handle both single and multiple tool calls per response. OpenAI models can return multiple function calls in a single response. Claude can also return multiple tool_use blocks in one response, but it does so less reliably and often emits one call per turn. Handle the tool_calls array generically: if you get multiple calls, execute them in parallel; if you get one, execute it on its own. Never assume that parallel calling is the default.
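A minimal sketch of that generic handling, assuming tools are async Python callables stored in a hypothetical `registry` dict and that each call has already been reduced to an `(id, name, args)` tuple. The names `Call`, `Registry`, and `execute_calls` are illustrative and do not come from either vendor's SDK.

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical shapes for illustration: (call_id, tool_name, arguments)
Call = tuple[str, str, dict[str, Any]]
Registry = dict[str, Callable[..., Awaitable[Any]]]


async def execute_calls(calls: list[Call], registry: Registry) -> list[tuple[str, Any]]:
    """Run however many calls the model returned: zero, one, or N."""

    async def run_one(call: Call) -> tuple[str, Any]:
        call_id, name, args = call
        try:
            # Unknown tool names raise KeyError here and are reported like any other failure.
            return call_id, await registry[name](**args)
        except Exception as exc:
            # Return the failure as a result so one bad call does not abort the others.
            return call_id, f"error: {exc!r}"

    if not calls:
        return []
    # gather runs the calls concurrently and preserves input order,
    # so each result lines up with the call that produced it.
    return list(await asyncio.gather(*(run_one(c) for c in calls)))
```

With a single call this degrades to ordinary sequential execution, so the same loop serves a model that emits one call per turn and a model that emits several at once.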
Journey Context:
A common pattern in GPT-4o-based agents is to receive an array of tool calls, execute them concurrently, and return all results. Porting such an agent to Claude can break it, because Claude's tool use tends to be more sequential: it decides on one action, waits for the result, then decides the next, even though the API supports multiple tool_use blocks in one response. The cross-model trap is building an agent that requires parallel tool calls for performance and then discovering that Claude rarely generates them. The safest pattern is to handle whatever you get, an array of 1 or N, and not to architect around parallel calls being guaranteed.
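One way to keep the agent loop provider-neutral is to convert each response into the same list shape before executing anything. The sketch below works on plain dicts and assumes the response layouts of the OpenAI Chat Completions API (`tool_calls` entries with a JSON-string `function.arguments`) and the Anthropic Messages API (`tool_use` content blocks with a dict `input`). The function names are hypothetical, and the field names should be checked against the current SDK documentation before use.

```python
import json
from typing import Any

# Common shape used by the rest of the loop: (call_id, tool_name, arguments)
Call = tuple[str, str, dict[str, Any]]


def calls_from_openai(message: dict[str, Any]) -> list[Call]:
    """Extract calls from an OpenAI-style assistant message dict (assumed layout)."""
    # tool_calls may be absent or None when the model answered with text only.
    raw = message.get("tool_calls") or []
    return [
        (c["id"], c["function"]["name"], json.loads(c["function"]["arguments"] or "{}"))
        for c in raw
    ]


def calls_from_claude(content: list[dict[str, Any]]) -> list[Call]:
    """Extract calls from a Claude-style content block list (assumed layout)."""
    # Text blocks and tool_use blocks can be interleaved; keep only the tool_use ones.
    return [
        (b["id"], b["name"], b["input"])
        for b in content
        if b.get("type") == "tool_use"
    ]
```

Both functions return a list of length 0, 1, or N, so downstream code never needs to know which model produced the response or how many calls it chose to emit.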
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T13:48:38.608803+00:00— report_created — created