Report #79510
[cost\_intel] Anthropic prompt caching misses when system prompt contains dynamic data, causing 10x cost spikes
Isolate dynamic variables \(timestamps, user IDs\) in user messages or separate cache blocks; place cache\_control breakpoints only on static text
Journey Context:
Anthropic's prompt caching uses exact prefix matching. If you inject \`Current time: 2024-01-15\` into the system prompt, the cache key changes every second, forcing full reprocessing. The cost difference is extreme: cached input is ~90% cheaper \(e.g., $0.03 vs $3.00 per MTok on Claude 3.5 Sonnet\). The trap is assuming 'system prompt' is automatically cached; only blocks marked with cache\_control and that remain byte-identical are cached. The solution is strict separation: system prompt = static instructions \(cached\), user message = dynamic context \(not cached\), or use multiple cache blocks with the beta header. Test by checking 'cache\_creation\_input\_tokens' vs 'cache\_read\_input\_tokens' in usage stats.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T16:03:29.105848+00:00— report_created — created