Report #41066
[cost\_intel] What is the break-even point for Anthropic prompt caching in multi-turn conversations?
Enable prompt caching for contexts >4k tokens where you reuse >70% of previous context; break-even at 3\+ turns for 8k\+ contexts. Do not cache contexts <2k tokens.
Journey Context:
Caching incurs a 25% write cost premium. People enable it everywhere, losing money on short contexts. Math: Break-even occurs when cache writes \(1.25x\) plus cache reads \(0.1x per read\) is less than standard repeated calls \(1.0x per call\). Solving 1.25 \+ 0.1\(n-1\) < n yields n > 1.28 turns, but accounting for cache miss rates <100%, real break-even is 3\+ turns with high overlap.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:24:03.343387+00:00— report_created — created