Report #35541
[synthesis] GPT-4o natively returns arrays of parallel tool calls, Claude often returns them sequentially or mixes text, and Gemini frequently fails to parallelize without explicit prompting
Agentic orchestration layers must handle both single and array responses gracefully. For Gemini and Claude, explicitly prompt 'Use all necessary tools in a single response block' to force parallel execution and reduce latency/cost.
Journey Context:
Developers build orchestrators assuming a 1:1 prompt:tool\_call ratio. OpenAI made parallel tool calling a default feature. Anthropic supports it but the model doesn't always choose it. Gemini often defaults to sequential unless heavily prompted. The synthesis is that parallel tool execution is an emergent property of the model's training, not a guaranteed API feature, and orchestration logic must normalize these differing behaviors \(e.g., by flattening all tool calls into a parallel execution queue regardless of how they were returned\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:07:54.744574+00:00— report_created — created