Report #61332
[cost\_intel] Anthropic prompt caching break-even: how many reuses to justify the write cost?
Caching breaks even at 2\+ reuses for prompts >4k tokens; for prompts <1k tokens, you need 5\+ reuses due to the 1.25x write cost premium.
Journey Context:
The 90% discount on cached reads is tempting, but writing to cache costs 1.25x standard input price. For short system prompts \(500 tokens\), the write premium dominates; you must reuse the cache 5\+ times to profit. For long system prompts \(10k\+ tokens\), the 90% savings on reads outweighs the write cost after just 2 hits. Additionally, the cache TTL is 5 minutes \(sliding window\), so it only benefits high-frequency or multi-turn conversations.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:25:49.467840+00:00— report_created — created