Report #92052
[synthesis] Models make different assumptions about whether multiple tool calls are independent or dependent, causing race conditions and data corruption
For Claude, explicitly state in the system prompt whether tool calls should be parallel or sequential. For GPT-4o, use the parallel_tool_calls parameter to control this. For Gemini, break complex multi-tool tasks into explicit sequential steps, because its parallel tool calling is less reliable and may drop or duplicate calls.
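A minimal sketch of how the per-model advice above might be applied when assembling a request. The helper name build_request_options, the provider labels, and the prompt wording are hypothetical and chosen only for illustration. The returned dictionary is meant to be merged into whatever request the caller already builds, and the field names should be checked against each provider's current documentation before use.

```python
# Hypothetical helper: makes the parallel-vs-sequential expectation explicit
# for each provider. Prompt wording and provider labels are assumptions.

SEQUENTIAL_NOTE = (
    "Call tools one at a time. Wait for each tool result before issuing "
    "the next call, because later calls depend on earlier results."
)
PARALLEL_NOTE = (
    "The tool calls in this task are independent. Issue them together "
    "in a single turn rather than one at a time."
)


def build_request_options(provider: str, base_system: str, sequential: bool) -> dict:
    """Return extra request fields that state the expected call ordering."""
    if provider == "openai":
        # The report's advice for GPT-4o: use the parallel_tool_calls parameter.
        return {"parallel_tool_calls": not sequential}
    if provider == "anthropic":
        # The report's advice for Claude: say it in the system prompt.
        note = SEQUENTIAL_NOTE if sequential else PARALLEL_NOTE
        return {"system": f"{base_system}\n\n{note}"}
    if provider == "gemini":
        # The report's advice for Gemini: always ask for explicit sequential
        # steps, so the `sequential` flag is deliberately ignored here.
        return {"system_instruction": f"{base_system}\n\n{SEQUENTIAL_NOTE}"}
    raise ValueError(f"unknown provider: {provider}")


if __name__ == "__main__":
    for name in ("openai", "anthropic", "gemini"):
        print(name, build_request_options(name, "You are a booking agent.", True))
```

Keeping the decision in one function means the caller declares once whether a task's calls depend on each other, and the provider-specific mechanism follows from that.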
Journey Context:
When a task requires multiple tool calls, GPT-4o natively supports parallel function calling and issues independent calls simultaneously: it assumes independence unless told otherwise. Claude 3.5 Sonnet also supports parallel tool use but is more conservative: when it is uncertain about dependencies, it may serialize calls that could run in parallel, assuming dependence unless told otherwise. Gemini Pro's parallel tool calling is the least reliable of the three: it may drop calls, duplicate them, or reorder them unpredictably. The critical insight is that GPT-4o and Claude hold opposite default assumptions about independence versus dependence. The common mistake is leaving parallelism expectations unspecified, so that GPT-4o issues in parallel calls that should be sequential (causing race conditions), or Claude serializes calls that should be parallel (adding latency). The right call is to control this explicitly for each model.
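As a complement to controlling the model, the application can also guard the execution side. The sketch below is one possible client-side executor, not something taken from any SDK: it assumes tool calls arrive as dictionaries with "name" and "arguments" keys (a hypothetical shape), removes exact duplicates, and runs calls concurrently only when the caller has declared them independent. Deduplication is only safe for tools where repeating an identical call is never intentional.

```python
import json
from concurrent.futures import ThreadPoolExecutor


def dedupe_calls(calls):
    """Drop exact duplicates (same name and arguments), preserving order."""
    seen = set()
    unique = []
    for call in calls:
        key = (call["name"], json.dumps(call["arguments"], sort_keys=True))
        if key not in seen:
            seen.add(key)
            unique.append(call)
    return unique


def run_tool_calls(calls, tools, independent):
    """Execute tool calls; concurrent only if declared independent.

    `tools` maps a tool name to a Python callable (assumed shape).
    Results are returned in the order of the deduplicated calls.
    """
    calls = dedupe_calls(calls)

    def invoke(call):
        return tools[call["name"]](**call["arguments"])

    if independent and calls:
        with ThreadPoolExecutor(max_workers=len(calls)) as pool:
            # pool.map yields results in input order, not completion order.
            return list(pool.map(invoke, calls))
    # Dependent (or unknown) calls run strictly one after another.
    return [invoke(call) for call in calls]


if __name__ == "__main__":
    tools = {"add": lambda a, b: a + b}
    calls = [
        {"name": "add", "arguments": {"a": 1, "b": 2}},
        {"name": "add", "arguments": {"a": 1, "b": 2}},  # duplicated call
        {"name": "add", "arguments": {"a": 2, "b": 3}},
    ]
    print(run_tool_calls(calls, tools, independent=False))
```

Defaulting to sequential execution whenever independence has not been declared trades latency for safety, which avoids the race-condition failure described above.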
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:06:01.410812+00:00— report_created — created