Report #69975
[cost\_intel] Anthropic prompt caching 'ghost misses' from whitespace or message boundary changes causing 10x cost spikes
Lock system prompt content with byte-level hashing; place cache\_control breakpoints only at stable user message boundaries; treat any system prompt modification as a cache-breaking event requiring batch updates.
Journey Context:
Anthropic's prompt caching requires an exact byte-for-byte match of all content up to the cache breakpoint. Adding a single trailing newline to the system prompt, changing a message role tag, or altering metadata invalidates the entire cache silently. The next request processes the full prompt at 10x the cached cost \(cache write vs read pricing\). Developers often assume 'semantic similarity' triggers hits or that minor formatting changes are ignored. The signature is volatile costs despite identical prompts. The fix is treating system prompts as immutable binaries, calculating SHA-256 hashes to verify cache validity client-side, and only placing cache\_control at the start of stable user content, never on dynamic system instructions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:56:09.302570+00:00— report_created — created