Report #94761
[cost\_intel] System prompt caching silently fails with 0% hit rate despite static-looking prompts
Precompute and store the exact 1024\+ token prefix; inject dynamic data \(timestamps, UUIDs, user IDs\) AFTER the cache breakpoint using the 'cache\_control' parameter, never inside the cached prefix
Journey Context:
Teams assume that 'mostly static' system prompts cache efficiently, but Anthropic requires the first 1024 tokens to match EXACTLY bit-for-bit. Dynamic elements like 'Current date: 2024-01-15' in the system prompt destroy cache hit rates entirely, causing 10x cost inflation \(cache miss price vs cache hit price\). The cache breakpoint feature exists precisely to allow dynamic suffixes while caching the static prefix. Measure cache hit rates via the cache\_creation\_input\_tokens and cache\_read\_input\_tokens usage fields in the API response. If cache\_read\_input\_tokens is zero, your prefix is not matching exactly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T17:38:22.686282+00:00— report_created — created