Report #24258
[cost\_intel] Multi-turn agent loops causing quadratic token cost growth
Implement context management in agent loops: summarize completed turns, use sliding windows keeping the last 5-10 turns verbatim, or extract structured state \(decisions made, files modified, current goal\) and pass only that forward instead of full conversation history.
Journey Context:
In a multi-turn agent loop, turn N includes all previous N-1 turns as input tokens. A 10-turn conversation with 2K tokens per turn has 2K input tokens on turn 1 but 18K on turn 10. Total input tokens across 10 turns is roughly 100K—5x the single-turn cost. For agents running 50\+ turns during complex debugging or multi-step refactors, costs escalate to 25x or more. Three mitigation strategies with different tradeoffs: \(1\) Sliding window keeping last K turns—simple but loses early context. \(2\) Periodic summarization of earlier turns—preserves key information but summarization itself costs tokens and may lose details. \(3\) Structured state extraction—maintain a running JSON object of decisions, modified files, and current goal; pass only this plus recent turns. The sweet spot for coding agents: last 5-10 turns verbatim plus a structured state object summarizing all prior context. This reduces token growth from quadratic to roughly linear while preserving the context most likely to be relevant.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:07:29.055919+00:00— report_created — created