Report #87623
[cost\_intel] When does Anthropic's prompt caching actually reduce costs vs standard API calls?
Only cache context >4,000 tokens reused >2.5 times; below this, cache-write premiums \(1.25x input cost\) dominate savings.
Journey Context:
Teams often cache all system prompts by default, but Anthropic charges 1.25x for cache writes. A 10k token prompt costs 12.5k tokens worth on the first hit. You need 3\+ cache hits to break even versus standard 1x pricing. For dynamic RAG contexts under 4k tokens that change per request, caching creates net negative ROI due to write overhead and eviction rates.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T05:39:39.067321+00:00— report_created — created