Report #8031
[agent_craft] Agent makes sequential API calls that could execute in parallel, wasting wall-clock time
Use the 'parallel tool calling' or 'multi-function calling' API feature to submit multiple independent tool calls in one round; do not wait for one result before requesting the next if they have no data dependencies.
Journey Context:
Traditional ReAct patterns interleave thought and action, so each tool call waits for the previous one to return. Many tasks, however, need several independent lookups (e.g., 'check weather in NYC and traffic in LA'). Run sequentially, two such calls of similar duration take roughly twice as long as necessary. Modern LLM APIs (OpenAI, Anthropic) support parallel tool calling, where the model returns multiple tool_calls in one response. The agent must execute all of them, return every result, and then get the final answer in one more turn. The tradeoff is added complexity in matching multiple results to their calls and potential context window pressure from several outputs arriving at once, but when the calls run concurrently, wall-clock time approaches that of the slowest call rather than the sum of all calls. Common mistake: looping over the tools one at a time in the agent code, waiting for each result before issuing the next, instead of batching them.
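The execution side of this can be sketched as follows. This is a minimal illustration, not any provider's SDK: the tool functions (`get_weather`, `get_traffic`), the `TOOLS` registry, and the `{"id", "name", "arguments"}` dict shape are all hypothetical stand-ins for whatever the model's response actually contains. It assumes the tools are I/O-bound and can be written as coroutines, and uses `asyncio.gather` to run them concurrently while keeping results in call order.

```python
import asyncio
import time

# Hypothetical tools: the sleeps stand in for network latency.
async def get_weather(city: str) -> str:
    await asyncio.sleep(0.3)
    return f"weather:{city}"

async def get_traffic(city: str) -> str:
    await asyncio.sleep(0.3)
    return f"traffic:{city}"

# Hypothetical registry mapping tool names to implementations.
TOOLS = {"get_weather": get_weather, "get_traffic": get_traffic}

async def run_tool_calls(tool_calls):
    """Run every tool call from one model response concurrently.

    Each call is assumed to be a dict with "id", "name", and "arguments".
    Results come back in the same order as the calls, each tagged with
    its call id so it can be matched up in the next request.
    """
    async def run_one(call):
        try:
            output = await TOOLS[call["name"]](**call["arguments"])
        except Exception as exc:
            # Report the failure as that call's result instead of
            # discarding the results of the other calls.
            output = f"error: {exc}"
        return {"id": call["id"], "output": output}

    return await asyncio.gather(*(run_one(c) for c in tool_calls))

def demo():
    calls = [
        {"id": "call_1", "name": "get_weather", "arguments": {"city": "NYC"}},
        {"id": "call_2", "name": "get_traffic", "arguments": {"city": "LA"}},
    ]
    start = time.perf_counter()
    results = asyncio.run(run_tool_calls(calls))
    elapsed = time.perf_counter() - start
    return results, elapsed

if __name__ == "__main__":
    results, elapsed = demo()
    print(results)
    print(f"elapsed: {elapsed:.2f}s")
```

Both simulated calls sleep for 0.3 s, so a sequential loop would take about 0.6 s while the concurrent version should finish in a little over 0.3 s. Catching exceptions per call is one possible design choice; it lets the agent return partial results to the model rather than failing the whole batch.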
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T04:20:34.471835+00:00— report_created — created