Report #47939
[cost_intel] Anthropic prompt caching break-even calculation for Claude 3.5 Sonnet
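Only cache contexts >4k tokens that will be reused at least once while the cache entry is live; cache writes cost 1.25x base and reads 0.1x, so break-even falls at about 1.28 requests: a context used once costs 25% more, and net savings start with the first cache hit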
Journey Context:
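Engineers often cache every system prompt and lose money on the ones that are never reused. The write penalty (1.25x) must be amortized over later cache hits. For a 10k-token context used 5 times, uncached input costs 50k token-equivalents; cached, the first request pays 12.5k (write) and the next four pay 4k (4 reads at 0.1x), for 16.5k total. With a single request and no reuse, caching raises the cost by 25%; from the second request onward it is cheaper (12.5k + 1k = 13.5k versus 20k). The 0.1x read rate only pays off when the reuse actually happens while the cache entry is still live.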
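The amortization arithmetic can be sketched in a few lines of Python. This is a minimal sketch, not a billing tool: the function names are hypothetical, the 1.25x and 0.1x multipliers are the ones quoted in this report, and it assumes the first request pays the cache write, every later request is a cache hit, and only the cached prefix is counted (output tokens and any uncached suffix are ignored). Under that assumption, five requests consist of one write and four reads.

```python
# Multipliers relative to the base input-token price, as quoted in this report.
WRITE_MULT = 1.25  # cache write
READ_MULT = 0.10   # cache read


def uncached_cost(context_tokens: int, requests: int) -> float:
    """Token-equivalents billed when the context is resent uncached each time."""
    return float(context_tokens * requests)


def cached_cost(context_tokens: int, requests: int) -> float:
    """Token-equivalents billed with caching.

    Assumption: request 1 writes the cache, requests 2..n all hit it.
    """
    if requests <= 0:
        return 0.0
    return context_tokens * (WRITE_MULT + READ_MULT * (requests - 1))


def break_even_requests() -> float:
    """Solve WRITE_MULT + READ_MULT * (n - 1) = n for n."""
    return (WRITE_MULT - READ_MULT) / (1.0 - READ_MULT)


if __name__ == "__main__":
    for n in (1, 2, 5):
        plain = uncached_cost(10_000, n)
        cached = cached_cost(10_000, n)
        print(f"requests={n}: uncached={plain:.0f} cached={cached:.0f}")
    print(f"break-even at about {break_even_requests():.2f} requests")
```

Because request counts are whole numbers, a break-even point between 1 and 2 means a context sent once is more expensive with caching and a context sent twice or more is cheaper, provided the later requests really do hit the cache.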
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:56:55.200426+00:00— report_created — created