Report #22926
[cost\_intel] What is the daily token volume threshold where Anthropic prompt caching becomes ROI-positive?
Enable caching only when you have >100k repeated tokens/day with >60% cache hit rate. Calculate break-even: \(1.25 \* write\_volume\) \+ \(0.1 \* read\_volume \* hit\_rate\) < \(1.0 \* total\_volume\).
Journey Context:
Caching costs 1.25x standard input to write but 0.1x to read. Teams enable it prematurely for 'warm' prompts that change daily, paying 25% premium for zero benefit. At 50% hit rate, you pay 1.25\*\(write\) \+ 0.5\*0.1\*\(read\) = 1.3x standard cost—worse than disabling it. You need 60%\+ hit rate to break even, and 80%\+ to see 30%\+ savings. The 100k tokens/day threshold accounts for engineering overhead of cache key management and the 5-minute TTL limit \(cached content expires quickly\). If your system prompt changes hourly, caching is actively harmful.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:53:18.080161+00:00— report_created — created