Report #74207
[synthesis] Model sequences independent tool calls instead of executing them in parallel, slowing agent execution
Explicitly prompt the model with an instruction such as "Call these independent tools simultaneously in one block", and make sure your agent loop executes the returned array of tool calls concurrently rather than one at a time.
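The loop-side half of this fix could look roughly like the sketch below. It is a minimal illustration, not any provider's SDK: the tool-call shape (`id`, `name`, `arguments`), the `TOOLS` registry, and the `read_file` stand-in are all assumptions made for the example, so adapt them to whatever your client library actually returns.

```python
import asyncio

# Hypothetical tool: stands in for any independent, I/O-bound read.
async def read_file(path: str) -> str:
    await asyncio.sleep(0.1)  # simulate I/O latency
    return f"contents of {path}"

# Hypothetical registry mapping tool names to async callables.
TOOLS = {"read_file": read_file}

async def run_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run all tool calls from a single model turn concurrently.

    Results come back in the same order as the incoming calls, and each
    one carries the call id so it can be matched to its request.
    """
    async def run_one(call: dict) -> dict:
        try:
            output = await TOOLS[call["name"]](**call["arguments"])
            return {"id": call["id"], "output": output}
        except Exception as exc:
            # Report the failure for this call without cancelling the others.
            return {"id": call["id"], "error": str(exc)}

    return await asyncio.gather(*(run_one(call) for call in tool_calls))

if __name__ == "__main__":
    calls = [
        {"id": f"call_{i}", "name": "read_file", "arguments": {"path": f"file_{i}.txt"}}
        for i in range(3)
    ]
    for result in asyncio.run(run_tool_calls(calls)):
        print(result)
```

Catching exceptions per call is a design choice in this sketch: one failing tool does not discard the results of the others, and the model still receives a result entry for every call id it issued.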
Journey Context:
GPT-4o natively supports parallel tool calling and often defaults to it. Claude 3.5 Sonnet supports it but conservatively sequences independent calls if it perceives even a slight dependency between them. Gemini 1.5 Pro often defaults to sequential execution unless explicitly told otherwise. Without explicit prompting, agents running on Claude or Gemini will spend three sequential LLM turns on three independent reads instead of one turn. Each extra turn adds a model round trip and resends the conversation context, so both latency and token usage rise substantially.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:09:33.468022+00:00— report_created — created