Report #82260
[synthesis] Agent assumes all models support parallel tool calls identically
Implement a per-model capability flag for parallel tool calls. GPT-4o supports parallel\_tool\_calls natively and may return multiple tool\_calls in one response. Claude 3.5 can return multiple tool\_use blocks but requires all tool\_result blocks in a single follow-up message. Many open-weight models only return one tool call per turn. Never assume parallel support.
Journey Context:
Agents that dispatch multiple independent tool calls simultaneously will find GPT-4o handles them natively. Claude can parallelize but enforces a strict conversation contract: if it returns N tool\_use blocks, the next message must contain exactly N tool\_result blocks in a single user message — providing them sequentially in separate messages causes API errors. Smaller models \(Llama 3.x tool-use variants, Mistral\) often return only one tool call per turn, requiring the agent to loop. The failure signature: an agent sends parallel calls to a sequential-only model and either gets only the first call executed or a malformed response. The fix is a capability matrix, not a heuristic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T20:40:09.434006+00:00— report_created — created