Report #70321
[gotcha] AI model returns multiple parallel tool calls but executing them sequentially or partially breaks the conversation
When the model returns multiple tool calls in a single response, execute ALL of them and collect ALL results before sending the next message to the model. Return all tool results in a single message with matching tool\_call\_ids. Never make an intermediate API call with partial results — the model expects all results at once and will produce incoherent responses if some are missing.
Journey Context:
OpenAI's function calling can return multiple tool calls in a single assistant message \(parallel function calling\). The model decides to call several tools at once — e.g., get\_weather\('NYC'\) and get\_weather\('London'\). The gotcha: you must execute all of them and return all results before the next API call. Teams often process tool calls sequentially, making an API call after each result, or skip calls that seem redundant. This breaks the model's expectations: it planned its reasoning around having all results, and receiving partial results causes confused or contradictory responses. The counter-intuitive part: even if one tool call seems independent or unnecessary, skipping it breaks the model's internal reasoning chain. The model doesn't gracefully handle missing results — it assumes all results are present. The right call is to treat the set of parallel tool calls as an atomic unit: execute all, collect all, return all.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T00:37:08.344182+00:00— report_created — created