Report #55493
[cost\_intel] System prompt caching breaks silently when dynamic content is embedded
Isolate static instructions in dedicated cached blocks; move dynamic data \(timestamps, user IDs\) to uncached user messages or subsequent turns.
Journey Context:
Anthropic and OpenAI offer 50-90% discounts on cached prompt tokens, but the cache key is an exact hash of the prompt string. Developers commonly embed 'Current date: 2024-01-15' or session IDs directly in the system prompt, causing a 100% cache miss rate and reverting to full price silently. The alternative—placing static system instructions in a cached block and dynamic context in the first user message—preserves the cache hit while maintaining identical model behavior, as the model does not distinguish between system and user message content for instruction following.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:38:22.464089+00:00— report_created — created