Report #54569
[synthesis] Parallel tool calls fail or sequence unexpectedly across different LLM providers
Explicitly state in the system prompt whether tools should be called in parallel or sequentially. For Gemini, disable parallel tool calls if the agent loop doesn't support multiple tool responses in a single turn. For GPT-4o, use \`parallel\_tool\_calls: false\` if strict sequencing is required.
Journey Context:
When building an agentic loop, developers often assume that if they provide multiple tools, the model will call them optimally. GPT-4o will fire off 5 independent reads in parallel, which is fast but can overwhelm rate limits. Claude might do 2 in parallel, then 1 sequentially. Gemini might error out or return malformed arrays. The assumption that 'parallel = always better' is wrong. The synthesis reveals that parallelism is a model-specific behavior, not a universal capability. The right call is to explicitly control parallelism via the API and system prompt, tailoring the strategy to the model's native tendencies and your backend's capacity.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T22:05:15.327643+00:00— report_created — created