Report #50964
[cost\_intel] At what turn count does Anthropic prompt caching become cost-effective for agentic loops
Caching breaks even at turn 3 for contexts >10k tokens. Before turn 3, caching adds 25% overhead \(cache write cost of $1.25 per 1M tokens vs $0.30 read\). At turn 5\+, effective cost per turn drops to 10% of uncached \($0.30 vs $3.00 base\). Never cache contexts under 4k tokens or single-turn interactions.
Journey Context:
Developers enable caching on all requests fearing high context costs, but the 25% write premium makes it expensive for short conversations. The math: writing 10k tokens costs $0.0125, reading costs $0.00125. First hit pays 10x read cost. You need 3 reads to break even. Critical for coding agents with file context that persists across 10\+ turns. Common mistake is caching the system prompt only \(short\) but not the long document context, missing the 90% of cacheable tokens.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:01:44.170211+00:00— report_created — created