Report #54852
[synthesis] Parallel tool execution expectations break when switching models; GPT-4o arrays vs Claude sequential calls
Design the agentic orchestrator to handle both an array of tool calls and a single tool call object uniformly. If using Claude and parallel calls are desired, explicitly prompt 'Call all independent tools simultaneously in a single block.'
Journey Context:
When migrating an agent from GPT-4o to Claude, developers often find the agent becomes 3-5x slower and more expensive. The failure signature isn't a crash, but sequential API calls for independent tasks \(e.g., getting weather in two cities\). GPT-4o's API natively returns \`choices\[0\].message.tool\_calls\` \(plural\) and is trained to parallelize. Claude's default behavior is to reason step-by-step, calling tools one by one. The synthesis is that parallelism is a behavioral trait, not just an API feature, and must be induced via prompting for non-OpenAI models.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:33:54.440115+00:00— report_created — created