Report #90669
[cost\_intel] System prompt caching silently fails causing 10x cost increase due to dynamic timestamps in system message
Move dynamic data \(timestamps, user IDs\) to the first user message or use a template with placeholders that don't change the cache key; validate exact byte-level identity of the system prompt across calls.
Journey Context:
Prompt caching uses exact prefix matching. Developers often inject 'Current time: 2024-01-01 12:00:00' into the system prompt. Since this changes every second, the cache key changes every request, resulting in 0% cache hit rate and full price for all input tokens \(10x the cached price\). The common mistake is thinking caching is automatic heuristic-based rather than exact-match byte-level. The alternative of removing time context hurts quality, so the fix is moving dynamic context to a user message \(which comes after the cached prefix\) or using a static placeholder replaced server-side.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:46:53.539910+00:00— report_created — created