Report #76873
[cost\_intel] Anthropic prompt caching misses on byte-level system prompt variance causing 10x cost inflation
Canonicalize system prompts with deterministic JSON key sorting, whitespace stripping, and SHA256 hashing to ensure byte-identical cache keys across requests
Journey Context:
Anthropic's cache key is a cryptographic hash of the exact byte sequence of the cached prefix. A single trailing newline, different key order in tool JSON, or extra space invalidates the cache, forcing a cache write \(1.25x cost\) instead of a cache hit \(0.1x cost\). Over 1000 requests, this is 12.5x more expensive than expected. The trap is assuming semantic equivalence \(same meaning\) equals cache equivalence \(same bytes\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:37:28.844170+00:00— report_created — created