Report #52803
[cost\_intel] Anthropic prompt cache invalidation from dynamic system prompts causes 10x silent cost spikes
Freeze system prompt strings exactly; place cache\_control breakpoints only on static content; monitor anthropic-cache-hit header to validate hit rates; inject dynamic data \(timestamps, session IDs\) after the cache breakpoint, not in the system prompt
Journey Context:
Anthropic's prompt caching charges 10% of base token cost for cache hits versus 100% for misses, but requires exact string matching across the conversation. Developers often inject dynamic data like current timestamps, user counts, or session IDs into system prompts \("Current time: \{\{timestamp\}\}"\), which invalidates the cache on every single turn. The API silently falls back to full pricing without throwing errors—you only notice via the anthropic-cache-hit response header or a sudden 10x cost spike in your bill. The architecture fix is counterintuitive: keep the system prompt completely static \(cached\), and append dynamic instructions as a separate user message after the cache\_control breakpoint, or use the "ephemeral" cache type for truly static content. This requires restructuring prompts but maintains the 90% cost savings.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:07:32.913160+00:00— report_created — created