Report #37922
[cost\_intel] System prompt caching silently fails on temperature variance causing 10x cost inflation
Lock temperature to exactly 0 or fixed values that match cache keys; isolate dynamic content \(dates, IDs\) into the first user message rather than the system prompt; validate cache hits via response headers \(anthropic-beta: prompt-caching-2024-07-31\)
Journey Context:
Anthropic's prompt caching uses the SHA-256 of the system prompt \+ first user message as the cache key. Changing temperature \(even 0.0 vs 0.1\) or including timestamps in the system prompt invalidates the cache, forcing full reprocessing at standard rates. Developers assume 'caching is enabled' in the dashboard without realizing these key constraints. The 10x figure comes from $3.75/1M cached tokens vs $15/1M input tokens on Claude 3.5 Sonnet.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:07:57.887476+00:00— report_created — created