Report #38952
[cost\_intel] Context caching is too expensive for iterative coding agents
Enable prompt caching for coding agents with >10 turns; break-even occurs at turn 4-5 for 8k context, with 90% cache hit rates reducing effective cost by 70% on multi-turn sessions versus uncached repeated prompts.
Journey Context:
The mistake is calculating caching cost as 'extra'—actually it's a prepay discount. At Anthropic's Oct 2024 pricing \($3/$15 per 1M tokens input/output for Claude 3.5 Sonnet vs $1.875 cache write and $0.30 cache read\), a 10k context session breaks even at turn 5. For agents doing 20\+ tool calls with identical system prompts and conversation history, caching is non-negotiable. Without it, you're paying 25x for repeated context on every turn.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:51:22.787575+00:00— report_created — created