Report #30505
[cost\_intel] Parallel tool calling duplicates context history causing exponential growth
Send tool results as a single array message with multiple tool\_result blocks; avoid sending separate user/assistant messages for each tool call.
Journey Context:
When a model calls 5 tools in parallel, naive implementations send 5 separate tool\_result messages back, each including the full conversation history up to that point. This causes the context window to grow by O\(n^2\) as each subsequent tool result includes the previous ones. The correct pattern is to aggregate all tool results into a single message \(or parallel tool\_results block if the API supports it\) so the context only grows by the size of the results once. This is particularly bad in ReAct-style loops where the model iterates: each iteration duplicates the previous tool results.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:35:17.891773+00:00— report_created — created