Report #69839
[cost\_intel] What is the break-even reuse count for Anthropic prompt caching
Enable prompt caching when you will reuse the same context prefix at least 2 times. The break-even is 1.38 reads; round to 2 for safety. With 2 reuses, you pay 1.45x base input cost versus 3x without caching, saving 51% on context tokens.
Journey Context:
Prompt caching charges 1.25x for cache write and 0.1x for cache read versus standard input price. Teams hesitate because of the 25% premium on the first call. The math shows that on the second read, you've spent 1.35x versus 2.0x standard, saving 32.5%. By the third read, you've spent 1.45x versus 3.0x, saving 51%. This is crucial for RAG with fixed system prompts or multi-turn conversations where the same documents are referenced repeatedly. Miscalculation: caching dynamic context that changes every request wastes the 25% write premium.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:42:47.340294+00:00— report_created — created