Report #71834
[cost\_intel] Anthropic prompt cache misses silently when system prompt has dynamic prefix
Place all dynamic content \(timestamps, session IDs\) AFTER the cached prefix; use a static string for the first 100\+ tokens of the system prompt.
Journey Context:
Anthropic's cache requires a byte-identical prefix. Engineers often prepend 'Current time: ...' or a unique request ID to the system prompt for observability, which invalidates the cache every turn. This causes a 100x cost jump from $0.03/1M to $3.00/1M tokens for that prefix. The alternative of putting dynamic data in the user message keeps the cache valid for the system prompt but breaks multi-turn continuity if not handled carefully. The correct architecture is a static 'persona' block \(cached\) followed immediately by the user message containing all variable state.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:09:33.698973+00:00— report_created — created