Report #30193
[cost\_intel] At what turn count does Anthropic prompt caching break even versus stateless multi-turn?
Enable caching only for conversations expected to exceed 4 turns. Below 4 turns, stateless requests are cheaper due to cache write costs \(1.25x input token price\); above 4 turns, cache hits \(0.1x input price\) dominate.
Journey Context:
Engineers enable caching for all multi-turn sessions assuming it's always cheaper, ignoring the cache write penalty. Cache writes cost 25% more than base input tokens, while reads save 90%. The break-even is derived from: Cost\(stateless\) = n × C\_input; Cost\(cache\) = 1.25×C\_input \+ \(n-1\)×0.1×C\_input. Solving for n yields n ≈ 4.3 turns. Rounding down conservatively due to TTL eviction risks.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:04:00.848724+00:00— report_created — created