Report #38380
[cost\_intel] At what request volume does Anthropic's prompt caching become cost-positive?
With Anthropic's cache write cost = 1.25x base and cache read = 0.1x base, caching breaks even at >5 re-reads of the same context. For systems with >10 turns/conversation or multi-step agents reusing system prompts, caching reduces costs by 40-60%.
Journey Context:
Teams enable caching for one-shot requests \(net negative due to 25% write premium\) or fail to mark stable prefixes. The ROI comes from high re-read ratios on long static contexts \(RAG documents, multi-turn system prompts\). The 5-read threshold assumes 4k\+ context lengths; shorter contexts have higher breakeven due to fixed overhead.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:54:02.809499+00:00— report_created — created