Report #26794
[cost\_intel] Anthropic prompt caching fails silently due to byte-level mismatches in system prompts
Use static string literals for cached blocks; move dynamic data \(timestamps, user IDs\) to uncached user messages; verify cache\_hit=true in response headers; avoid f-string interpolation in cached system prompts
Journey Context:
Anthropic's prompt caching offers ~90% cost reduction, but the cache key is a cryptographic hash of the exact bytes. A single timestamp or whitespace change misses the cache, causing 10x cost spikes silently. Developers often inject dynamic context like 'Current date: 2024-01-01' into the system prompt, breaking cache every request. The alternative—separating static cached blocks from dynamic user messages—preserves cache alignment while maintaining flexibility. This is critical because cache misses at scale can turn a $100/day workload into $1000/day without alerting.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:22:17.115554+00:00— report_created — created