Report #55048
[synthesis] Agent execution loops are slow because the model calls tools sequentially instead of in parallel
Explicitly instruct the model: 'Call all independent tools in the same function_call block', and make sure your agent orchestration layer executes arrays of tool calls concurrently.
Journey Context:
Claude 3.5 Sonnet tends to parallelize independent tool calls on its own (e.g., getting weather for two cities), which cuts latency substantially. GPT-4o supports parallel tool calls but often falls back to sequential calls unless parallel_tool_calls: true is set and the prompt explicitly pushes for parallelism. Either way, the agent loop must accept an array of tool calls and execute them concurrently; otherwise even Claude's parallel outputs are serialized at execution time.
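The orchestration side of this can be sketched roughly as follows. This is a minimal illustration, not any vendor's SDK: the call shape (`id`, `name`, `arguments`), the `TOOLS` registry, and the `get_weather` stub are all hypothetical stand-ins, and it assumes tool implementations are async and I/O-bound.

```python
import asyncio
import time
from typing import Any, Awaitable, Callable


# Hypothetical tool: the sleep stands in for a slow network request.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0.2)
    return f"Weather in {city}: sunny"


# Hypothetical registry mapping tool names to async implementations.
TOOLS: dict[str, Callable[..., Awaitable[Any]]] = {"get_weather": get_weather}


async def run_one(call: dict) -> dict:
    """Run a single tool call, capturing errors so one failure
    does not discard the results of the other calls."""
    try:
        result = await TOOLS[call["name"]](**call["arguments"])
        return {"id": call["id"], "ok": True, "content": result}
    except Exception as exc:  # includes KeyError for an unknown tool name
        return {"id": call["id"], "ok": False, "content": repr(exc)}


async def run_tool_calls(calls: list[dict]) -> list[dict]:
    """Execute every tool call from one model turn concurrently.
    asyncio.gather returns results in the same order as the inputs."""
    return await asyncio.gather(*(run_one(c) for c in calls))


def demo() -> tuple[list[dict], float]:
    # Assumed shape of the tool-call array parsed from one model response.
    calls = [
        {"id": "call_1", "name": "get_weather", "arguments": {"city": "Paris"}},
        {"id": "call_2", "name": "get_weather", "arguments": {"city": "Tokyo"}},
    ]
    start = time.perf_counter()
    results = asyncio.run(run_tool_calls(calls))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    results, elapsed = demo()
    for r in results:
        print(r["id"], r["content"])
    print(f"elapsed: {elapsed:.2f}s")
```

With both calls in flight at once, wall-clock time is close to the slowest single call rather than the sum of all calls. Each result carries the originating call `id`, so it can be matched back to the right tool call when the results are returned to the model.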
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:53:26.786720+00:00— report_created — created