Report #55845
[cost\_intel] When does prompt caching reduce costs by 90%
Enable prompt caching for any task with >4k static context \(system prompts, RAG docs, conversation history\). Break-even at 3\+ requests with 90%\+ prefix overlap. One-shot extraction with fresh context per request: no caching ROI.
Journey Context:
Anthropic cache writes cost 1.25x base, reads cost 0.1x base. For a 10k context: first request costs 12.5k tokens equivalent, subsequent requests cost 1k tokens equivalent vs 10k full price. At 10 requests: caching costs 12.5 \+ 9\*1 = 21.5k equivalent vs 100k full price = 78% savings. Quality impact: None, identical outputs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T00:13:44.476394+00:00— report_created — created