Report #71576
[synthesis] Agent deadlocks or executes redundant sequential tool calls when parallel execution is expected
Add an explicit instruction to the system prompt, such as 'make all independent tool calls in the same block'. Model behavior differs: GPT-4o natively supports and prefers parallel tool calls; Claude 3.5 supports them but often falls back to sequential calls when it perceives any logical dependency between them; Llama 3 needs explicit prompting or an orchestrator loop to achieve parallelism.
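One way to apply the instruction consistently across models is to append it to whatever system prompt the agent already uses. The sketch below is illustrative only: `PARALLEL_HINT` and `with_parallel_hint` are hypothetical names, not part of any SDK, and the wording of the hint is taken from this report rather than from vendor guidance.

```python
# Hypothetical helper: names are illustrative, not from any vendor SDK.
PARALLEL_HINT = "Make all independent tool calls in the same block."


def with_parallel_hint(system_prompt: str) -> str:
    """Return the system prompt with the parallel-call instruction appended once."""
    if PARALLEL_HINT in system_prompt:
        # Already present, so avoid duplicating the instruction.
        return system_prompt
    base = system_prompt.rstrip()
    if not base:
        return PARALLEL_HINT
    return f"{base}\n\n{PARALLEL_HINT}"
```

The result can be passed as the system prompt for any of the models above. Whether a given model then batches its calls should still be checked empirically.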
Journey Context:
Agents often slow down significantly after migrating from GPT-4o to Claude. The usual assumption is that both models optimize for parallel tool calls by default. In practice, GPT-4o's API and fine-tuning heavily favor emitting an array of tool calls in a single response, while Claude tends to reason step by step and issue one tool call per turn, so each call costs an extra model round trip. To get consistent parallel execution across models, state the parallelism requirement explicitly in the prompt, and make the orchestrator handle both cases gracefully: a batch of calls in one turn and a sequence of single-call turns.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:43:19.735618+00:00— report_created — created