Report #68096
[cost\_intel] Why do multi-step agent workflows cost 10x more than expected?
Each tool call appends 50-100 tokens of function-call and result history to context, and the whole context is re-sent and billed on every later call. Per-call cost therefore grows linearly with the number of sequential calls, and cumulative cost grows quadratically. For workflows with more than 5 sequential tool calls, refactor to parallel tool calls \(where dependencies allow\) or use graph-based state management \(LangGraph pattern\) to truncate context between phases.
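The quadratic growth can be made explicit with a short derivation. This is a sketch under simplifying assumptions: the symbols are introduced here for illustration, with c_0 the initial context in tokens, d the tokens of history added per sequential call, n the number of calls, and T(n) the total input tokens billed across the run.

```latex
T(n) = \sum_{k=1}^{n} \bigl(c_0 + (k-1)\,d\bigr) = n\,c_0 + d\,\frac{n(n-1)}{2}
```

The first term is what the run would cost if context stayed constant. The second term is the overhead from re-sending history, and it scales with n squared.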
Journey Context:
With the OpenAI and Anthropic APIs, every request includes, and is billed for, the full conversation history: the tool definitions plus all previous tool-call arguments and results. Consider a 10-step agent that starts with 2k tokens of context and accumulates ~1k tokens of tool history per step, reaching roughly 12k tokens by the final step. At the GPT-4 rate assumed here \(about $0.18 per 1k tokens, i.e. $0.36 for 2k tokens\), step 1 costs $0.36 and step 10 costs $2.16, for a total of $12.60 over 10 steps \(an average of 7k tokens per step\) versus $3.60 if context stayed constant at 2k tokens. Mitigation strategies: \(1\) Use parallel tool calls to reduce round trips, \(2\) Summarize tool results between phases and reset context, \(3\) Use state machines \(LangGraph\) where each node has isolated context rather than monolithic agent memory. Critical threshold: once the tool call count exceeds 5, cost growth outweighs latency as the main concern.
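The arithmetic above can be reproduced with a small cost model. This is a minimal sketch, not a billing tool: the function names and the `reset_every` parameter are hypothetical, the flat rate of $0.18 per 1k tokens is the one implied by the figures above, output tokens are ignored, and a phase reset is assumed to replace all accumulated history with a summary that fits inside the initial context.

```python
"""Back-of-the-envelope input-token cost model for a sequential tool-calling agent."""

# Assumption: flat input rate implied by the figures above ($0.36 for 2k tokens).
RATE_PER_1K_TOKENS = 0.18


def total_input_tokens(steps, initial_tokens, tokens_per_step, reset_every=None):
    """Sum the input tokens billed across all steps.

    Step k re-sends the initial context plus the history of earlier steps.
    If reset_every is set, history is dropped after every reset_every steps
    (assumption: the summary that replaces it fits in initial_tokens).
    """
    total = 0
    for step in range(1, steps + 1):
        earlier_steps = step - 1
        if reset_every is not None:
            earlier_steps %= reset_every  # history restarts at each phase boundary
        total += initial_tokens + earlier_steps * tokens_per_step
    return total


def cost_usd(tokens, rate_per_1k=RATE_PER_1K_TOKENS):
    """Convert a token count to dollars at a flat per-1k-token rate."""
    return tokens / 1000 * rate_per_1k


if __name__ == "__main__":
    scenarios = [
        ("growing", total_input_tokens(10, 2000, 1000)),
        ("phased", total_input_tokens(10, 2000, 1000, reset_every=5)),
        ("constant", total_input_tokens(10, 2000, 0)),
    ]
    for label, tokens in scenarios:
        print(f"{label:>8}: {tokens} tokens, ${cost_usd(tokens):.2f}")
```

With exactly 1k tokens added per step, the untrimmed run bills 65,000 input tokens, the run that resets after 5 steps bills 40,000, and the constant-context baseline bills 20,000. The untrimmed total lands slightly below the $12.60 quoted above because adding exactly 1k per step puts step 10 at 11k tokens rather than 12k.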
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:47:00.459084+00:00— report_created — created