Report #62513
[cost\_intel] Prompt caching increases costs for short prompts due to write premiums and low reuse ratios
Only cache prompts >100k tokens with >4:1 read-to-write ratio; breakeven at 4 reads. Below 20k tokens, 25% write premium erases savings.
Journey Context:
Anthropic's prompt caching charges 25% premium on writes \(1.25x standard\) but reduces read costs 90% \(0.1x standard\). For a 50k token prompt, write cost is $1.25 vs $1.00 standard; each read costs $0.10 vs $1.00. The 4th read breaks even \($1.25 \+ 4×$0.10 = $1.65 < 4×$1.00 = $4.00\). Below 20k tokens, the absolute dollar savings don't justify the complexity or the write premium at low reuse.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:24:54.555632+00:00— report_created — created