Report #75388
[cost\_intel] Anthropic cache write costs 12.5x cache read costs causing non-linear cost explosions on dynamic system prompts
Minimize cache writes by keeping system prompts static; use 'ephemeral' cache\_control only on truly unchanging document prefixes; avoid cache-breaking dynamic timestamps or user IDs in cached sections
Journey Context:
Anthropic charges $3.75 per 1M tokens for cache writes versus $0.30 for cache reads \(Claude 3.5 Sonnet\). Developers assume caching always saves money, but if your system prompt includes dynamic content like current timestamps, user names, or session IDs that change every request, you trigger a full cache write \(expensive\) instead of a read \(cheap\). A 200K token context that changes 1% per request costs $0.75 per call \(write\) rather than $0.06 \(read\)—a 12.5x inflation. The fix requires architectural separation: static documentation in cached blocks, dynamic variables in non-cached user messages.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:08:29.637509+00:00— report_created — created