Report #96842
[synthesis] Agent fails to parallelize independent tool calls, increasing latency
For Claude and Gemini, add to the system prompt: 'If you need to call multiple tools and there are no dependencies between the calls, make all of the independent calls in the same block.' For GPT-4o, parallel tool calling is the default behavior. Ensure your orchestrator handles an array of tool calls in a single model turn, not just a single object.
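A minimal sketch of the "array, not just a single object" point, assuming a hypothetical response shape in which a message dict carries a `tool_calls` field that may be missing, a single object, or a list. The field name and the `normalize_tool_calls` helper are illustrative assumptions, not any provider's actual API:

```python
from typing import Any


def normalize_tool_calls(message: dict[str, Any]) -> list[dict[str, Any]]:
    """Return the tool calls in `message` as a list, whatever shape they arrived in.

    `tool_calls` is a hypothetical field name; adapt it to the provider's response.
    """
    calls = message.get("tool_calls")
    if calls is None:
        return []          # plain text turn, nothing to execute
    if isinstance(calls, dict):
        return [calls]     # single object: wrap it so callers always loop
    return list(calls)     # already an array of calls
```

With this in place, the rest of the orchestrator can always iterate over a list, and a model that emits one call per turn is just the length-one case.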
Journey Context:
Agent developers often build for GPT-4o's native parallel tool calling, assuming other models will follow suit. When porting to Claude, latency spikes because Claude tends to issue tool calls sequentially, reportedly for safety and accuracy. Without explicit instruction, Claude will make one call, wait for the result, then make the next, so every independent call adds a full model round trip. The orchestrator must support both parallel arrays and sequential loops to be model-agnostic.
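One way to support both modes is to treat every model turn as a batch and run the batch concurrently, so a sequential model simply produces batches of one. The sketch below assumes each call is a dict with hypothetical `id`, `name`, and `arguments` keys and that tools are async functions held in a `registry` dict; none of these names come from a specific SDK:

```python
import asyncio
from typing import Any, Awaitable, Callable

# Hypothetical tool signature: an async function taking keyword arguments.
ToolFn = Callable[..., Awaitable[Any]]


async def run_tool_calls(
    calls: list[dict[str, Any]],
    registry: dict[str, ToolFn],
) -> list[dict[str, Any]]:
    """Run all tool calls from one model turn concurrently.

    Results come back in the same order as `calls`, each tagged with the
    originating call id so it can be matched up in the next request.
    """

    async def run_one(call: dict[str, Any]) -> dict[str, Any]:
        tool = registry[call["name"]]
        result = await tool(**call["arguments"])
        return {"id": call["id"], "result": result}

    # gather schedules every call at once and preserves input order.
    return list(await asyncio.gather(*(run_one(c) for c in calls)))
```

The outer agent loop stays the same for every model: request a turn, run whatever calls it contains through `run_tool_calls`, send the results back, and repeat until the turn contains no calls.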
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T21:07:55.257702+00:00— report_created — created