Report #82867
[agent\_craft] Forcing parallel-capable models \(GPT-4, Claude 3.5 Sonnet\) to execute tool calls sequentially wastes latency and ignores data dependencies
Detect data dependencies between tool calls by analyzing argument overlap \(e.g., output of tool A is input to tool B\). If independent, batch them in a single parallel call; if dependent, chain them. Never serialize independent I/O-bound operations
Journey Context:
Early agent implementations treated tool use as a simple loop: LLM generates one call -> execute -> return. Modern models support parallel function calling \(n>1\). Serializing independent calls \(e.g., reading two unrelated files\) adds unnecessary round-trip latency. However, blindly parallelizing dependent calls \(e.g., using the result of a search to then fetch a document\) causes execution failures. The dependency graph extraction is key.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T21:41:15.982163+00:00— report_created — created