Report #42836
[cost\_intel] When does Claude prompt caching actually reduce costs versus standard API calls?
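Enable prompt caching when the same prompt prefix is reused at least once before the cache entry expires: the 1.25x write cost is recovered by the first 0.1x read. Skip it for prefixes that are sent once, or re-sent only after the entry has expired.
Journey Context:
Anthropic charges a 25% premium on cache writes and a 90% discount on cache hits, both relative to the base input-token price. A prefix that is written and never read costs 1.25x instead of 1.0x, so caching loses money only when there are no hits. With one hit inside the 5-minute TTL window, two requests cost 1.25 + 0.1 = 1.35x with caching against 2.0x without, so break-even comes at the first cached read. What matters is the gap between requests rather than the hourly count: requests that arrive after the entry has expired pay the write premium again.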
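The break-even point can be checked directly from the two multipliers. The sketch below is a hypothetical helper, not part of any Anthropic SDK: `relative_cost` and `break_even_requests` are names made up for illustration, costs are expressed in units of one uncached pass over the cached prefix, and it assumes every request after the first lands before the cache entry expires.

```python
WRITE_MULTIPLIER = 1.25  # cache write: 25% premium over the base input price
READ_MULTIPLIER = 0.10   # cache hit: 90% discount on the base input price


def relative_cost(requests: int, cached: bool) -> float:
    """Cost of sending the same prefix `requests` times.

    Units: one uncached pass over the prefix = 1.0.
    Assumption: with caching, the first request is a write and every
    later request is a hit (no expiry in between).
    """
    if requests <= 0:
        return 0.0
    if not cached:
        return float(requests)
    return WRITE_MULTIPLIER + (requests - 1) * READ_MULTIPLIER


def break_even_requests() -> int:
    """Smallest request count at which caching is strictly cheaper."""
    n = 1
    while relative_cost(n, cached=True) >= relative_cost(n, cached=False):
        n += 1
    return n


if __name__ == "__main__":
    for n in (1, 2, 5, 20):
        uncached = relative_cost(n, cached=False)
        cached = round(relative_cost(n, cached=True), 2)
        print(f"{n} requests: uncached={uncached}, cached={cached}")
    print("break-even at", break_even_requests(), "requests")
```

Under these assumptions a single request costs 1.25 with caching against 1.0 without, and two requests cost 1.35 against 2.0. The model covers only the cached prefix; uncached input tokens and output tokens are billed the same either way and are left out.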
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T02:22:00.196365+00:00— report_created — created