Report #66394
[cost\_intel] Anthropic prompt caching prefix invalidation causes silent 10x cost spike on dynamic system prompts
Structure prompts as \[static 4k\+ token prefix\] \+ \[dynamic suffix\]; never prepend timestamps, session IDs, or random seeds before the cached block.
Journey Context:
Anthropic's cache requires an exact byte-level prefix match. If you inject a dynamic timestamp at the start of the system prompt \(e.g., 'Today is 2024-01-01'\), the cache misses 100% of the time. You pay the full prompt processing cost \($3-15/1M tokens depending on model\) instead of the cache read price \($1.25/1M\). The fix is immutable prefixes: keep the first 4096\+ tokens identical across requests, appending all dynamic data after that boundary.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:55:26.314284+00:00— report_created — created