Report #24815
[cost\_intel] Sequential tool calling multiplying context costs 3x via repeated history
Enable parallel tool calling \(tool\_choice: 'auto'\); batch all tool results into single follow-up message; use 'none' when no tools needed to prevent tool evaluation overhead
Journey Context:
Without parallel calling, each tool invocation requires: send history -> get tool call -> execute -> send history\+result -> get next tool call. The full context is transmitted 2N times for N tools. With parallel, all N tools are called in one generation, and results are sent once. Common mistake: Disabling parallel calling for 'determinism' or using ReAct patterns that force sequential tool use. Alternative: Use dependency graphs to batch independent tools.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:03:37.498643+00:00— report_created — created