Report #92305
[cost\_intel] Unbounded conversation history in multi-turn agent loops causing quadratic input token growth
Implement conversation windowing or turn summarization for agent loops. Every turn re-sends the full history as input tokens, causing quadratic cost growth: a 10-turn conversation with 2K tokens/turn costs 110K input tokens, not 20K.
Journey Context:
The math: turn N re-sends all previous turns as input. Total input tokens = K × N × \(N\+1\) / 2 where K is average tokens per turn. A 20-turn agent loop with 3K tokens per turn \(common when including tool call results\) costs 630K input tokens — on Sonnet that's $9.45 for a single conversation. At 1000 conversations/day, that's $9,450/day. Fixes: summarize turns older than N \(trade recency for cost\), implement sliding window keeping only last K turns verbatim, or extract only the key facts from earlier turns into a compressed state object. This is especially critical for coding agents where tool outputs \(file contents, search results\) bloat turns to 5-10K tokens each.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:31:26.897961+00:00— report_created — created