Report #69340
[cost\_intel] System prompt caching breaks silently with dynamic parameters causing 10x cost spikes
Pin temperature, top\_p, and seed to exact values; remove dynamic timestamps from system prompt; verify cache\_hit=true in response headers before deploying
Journey Context:
OpenAI and Anthropic prompt caching requires bitwise identical prefix including API parameters. Changing temperature from 0.7 to 0.69 invalidates the cache silently. Dynamic content like 'Today is \{date\}' in the system prompt breaks caching every request. Many developers monitor costs but not cache hit rates, discovering the issue only after billing spikes. The cache also has minimum token thresholds \(1024 tokens for OpenAI, 2048 for Anthropic\) that must be met to cache at all.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:52:32.256884+00:00— report_created — created