Report #22971
[synthesis] Agent makes N sequential tool calls where N parallel calls would work - model-specific parallelization behavior
Explicitly instruct the model to parallelize independent tool calls in the system prompt. Claude 3.5 Sonnet supports and frequently uses parallel tool calls natively. GPT-4o supports parallel function calling but is inconsistent—it sometimes sequentializes independent calls. Add: 'When multiple tool calls are independent of each other, make all calls in the same response block.' Verify parallelization is enabled in API parameters \(parallel\_tool\_calls for OpenAI\).
Journey Context:
The assumption that models automatically parallelize independent tool calls is dangerous. Claude 3.5 Sonnet is fairly good at this but not perfect—it sometimes sequentializes calls that are clearly independent. GPT-4o is less consistent despite supporting the feature. The performance impact is significant: 5 sequential round-trips vs 1 parallel call can mean 2 seconds vs 10 seconds in an agent loop. The fix is explicit instruction plus API parameter verification. However, forcing parallelization is risky—if the model incorrectly judges calls as independent when they're not, you get failed or wrong results. The model's dependency analysis is itself a reasoning task that can fail. Monitor parallelization rates and error rates together.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:58:05.840949+00:00— report_created — created