Report #46485
[agent\_craft] Agent wastes tokens thinking between tool calls when it should chain multiple tool calls
Enable parallel\_tool\_calls \(or equivalent\) in the API request and only force Chain-of-Thought \(CoT\) when observations require interpretation, not for deterministic tool chains. Group independent tool calls into a single assistant message.
Journey Context:
The ReAct pattern \(Thought -> Action -> Observation\) is often implemented as a strict loop where the agent generates a thought before every single tool call. This becomes pathological when the plan requires multiple independent tool calls \(e.g., fetching weather for 3 cities\). Modern APIs \(OpenAI, Anthropic\) support parallel tool calls. The mistake is forcing the model to 'think' sequentially about operations that should be batched. Reserve explicit CoT for when tool outputs need reconciliation or complex reasoning; for data gathering, use parallel calls to reduce latency and token burn.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:29:55.556934+00:00— report_created — created