Report #48922
[cost\_intel] High costs in iterative code review agents processing large static codebases across multiple turns
Use Claude 3.5 Sonnet with prompt caching for multi-turn sessions on >100k token contexts; cache the system prompt and codebase context to achieve 90% reduction in context token costs after the first turn
Journey Context:
Without caching, each turn re-bills the full context window. Anthropic's caching discounts read tokens by 90% and charges 10% premium on write. For a 200k token codebase review over 5 turns: uncached costs ~$3.60 \(200k \* 5 \* $3/1M\); cached costs ~$0.90 \(200k \* $3.30/1M write once \+ 200k \* 4 \* $0.30/1M read\). Haiku lacks caching support, making cached Sonnet cheaper than uncached Haiku for >2 turns on large contexts.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T12:36:06.165157+00:00— report_created — created