Report #62176
[synthesis] Parallel tool calls result in malformed JSON or dropped tool executions when switching from GPT-4o to open-weight models
Explicitly force sequential tool execution in the orchestrator for open-weight models (Llama 3) and Claude, rather than relying on the model to output an array of parallel tool calls.
Journey Context:
GPT-4o natively supports parallel tool calling (returning an array of tool calls in a single turn). Claude 3.5 supports it too, but often serializes independent calls when it detects even a weak logical dependency between them. Llama 3 frequently merges multiple tool calls into a single malformed JSON object, which breaks schema validation. Assuming universal parallel tool support leads to silent failures; having the orchestrator issue calls sequentially, unless parallelism has been explicitly verified for a given model, keeps behavior reliable across models.
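A minimal sketch of the sequential approach described above, in Python. It is vendor-agnostic and makes several assumptions that are not part of this report: `ask_model` is a hypothetical callable standing in for whatever client the orchestrator uses, the `{"name": ..., "arguments": {...}}` call shape is an illustrative schema, and `ToolCallError`, `parse_tool_call`, and `run_sequential` are made-up names. The idea is to accept exactly one tool call per turn, validate it strictly, and raise on anything malformed instead of dropping it silently.

```python
import json
from typing import Any, Callable, Dict, List, Optional, Tuple


class ToolCallError(ValueError):
    """Raised for a malformed tool call, so it is never dropped silently."""


def parse_tool_call(raw: str, tools: Dict[str, Callable[..., Any]]) -> Tuple[str, dict]:
    """Parse and validate ONE tool call.

    Assumed (illustrative) shape: {"name": <str>, "arguments": <object>}.
    Concatenated or merged calls fail here instead of being half-executed.
    """
    try:
        call = json.loads(raw)
    except json.JSONDecodeError as exc:
        # Covers truncated JSON and two objects emitted back to back.
        raise ToolCallError(f"malformed JSON: {exc}") from exc
    if not isinstance(call, dict) or set(call) != {"name", "arguments"}:
        # Covers several calls merged into one object with extra keys.
        raise ToolCallError(f"unexpected tool call shape: {call!r}")
    name, args = call["name"], call["arguments"]
    if name not in tools:
        raise ToolCallError(f"unknown tool: {name!r}")
    if not isinstance(args, dict):
        raise ToolCallError("arguments must be a JSON object")
    return name, args


def run_sequential(
    ask_model: Callable[[List[dict]], Optional[str]],
    tools: Dict[str, Callable[..., Any]],
    max_steps: int = 8,
) -> List[dict]:
    """Drive the model one tool call per turn.

    `ask_model` (hypothetical) receives the history so far and returns the raw
    JSON text of the next single tool call, or None when the model is done.
    """
    history: List[dict] = []
    for _ in range(max_steps):
        raw = ask_model(history)
        if raw is None:
            return history
        name, args = parse_tool_call(raw, tools)
        result = tools[name](**args)
        # The result is appended before the next request, so the model can
        # depend on it without emitting an array of parallel calls.
        history.append({"tool": name, "arguments": args, "result": result})
    raise ToolCallError("step limit reached before the model finished")
```

In use, `tools` would map tool names to ordinary Python callables, and `ask_model` would wrap the provider request. Some providers also expose a request-level switch to disable parallel tool calls; if yours does, setting it alongside a loop like this is worth checking against the provider's documentation.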
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T10:51:00.622693+00:00— report_created — created