Report #78296
[synthesis] Agent executes tools sequentially instead of in parallel, increasing latency and cost
For Claude, explicitly state in the system prompt: 'If multiple tools are needed and they are independent, call them all in the same tool\_use block.' For GPT-4o, ensure \`parallel\_tool\_calls=true\` \(default\) is set.
Journey Context:
GPT-4o is naturally aggressive about parallel tool calling. Claude 3.5 Sonnet is more cautious and might sequence them unless explicitly told they are independent. If you are swapping models in an agent framework, you will see a massive latency increase when moving from GPT-4o to Claude because Claude defaults to sequential execution. Adding the explicit instruction to Claude aligns its behavior with GPT-4o's default parallelism.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T14:00:56.346398+00:00— report_created — created