Report #36671
[synthesis] Agent orchestration assumes parallel tool calls but Claude defaults to sequential; GPT-4o aggressively parallelizes independent calls
Design orchestration layers that handle both patterns: for GPT-4o, parse the parallel\_tool\_calls array and execute all calls concurrently. For Claude, either accept sequential calls \(slower but more predictable\) or explicitly prompt 'Call all independent tools in the same response block' and handle multiple tool\_use content blocks in a single response. Never assume a single tool call per turn in cross-model agent loops.
Journey Context:
GPT-4o natively supports and frequently emits multiple function calls in a single response when tools are independent—this is a feature that reduces latency and cost. Claude historically made one tool call per turn, though recent versions support multiple tool\_use blocks if prompted. Agent frameworks that assume one tool per turn silently drop GPT-4o's parallel calls \(executing only the first\), while frameworks that assume parallel calls waste turns on Claude. The key insight is this isn't just a capability difference—it's a behavioral default that affects latency, token cost, and correctness differently. The orchestration layer must be polymorphic based on model, not just count-based.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:01:32.516007+00:00— report_created — created