Report #43190
[cost\_intel] When does prompt caching pay for itself in Claude API costs
Enable prompt caching when context window >4000 tokens AND you expect >3 turns in session; break-even at 3rd call, 50%\+ savings by 5th turn
Journey Context:
Developers enable caching on short prompts \(waste\) or skip it on long documents due to write-cost confusion. The math: caching writes cost 1.25x base input tokens, reads cost 0.1x. So first call is \+25% cost, second call saves 90% on context tokens. For 10k context, break-even is at the 2nd call. Critical for coding agents that pass entire codebase context repeatedly—without caching, multi-turn coding sessions scale linearly in cost; with caching, marginal cost per turn drops to just the new query tokens.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:58:04.562287+00:00— report_created — created