Report #24652
[cost\_intel] Google and Anthropic caching are equivalent — just pick whichever provider you're using
Match caching strategy to workload TTL: use Google context caching for long-lived shared contexts \(default TTL of 1 hour, configurable up to hours or days\), and use Anthropic prompt caching for short-lived per-session prefixes \(5-minute default TTL, refreshed on each hit\).
Journey Context:
The two caching mechanisms have fundamentally different TTL models, and each suits a different workload. Anthropic's prompt caching has a default 5-minute TTL that is refreshed on each cache hit, which makes it a good fit for per-session system prompts and few-shot prefixes while a user is actively interacting. Google's context caching has a default TTL of 1 hour, which can be set explicitly to hours or days, and is designed for contexts shared across many users \(e.g., a large document or codebase that multiple sessions reference\). Using Anthropic caching for a shared reference document means the cache expires whenever more than 5 minutes pass between requests, so sparse traffic keeps paying for cache writes; using Google caching for a per-session prefix means paying for storage time the session may never use. The cost structures also differ: Google bills cache storage by time \(per token-hour\) regardless of how often the cache is hit, in addition to a discounted rate for cached input tokens, while Anthropic bills per cache write \(at a premium over normal input tokens\) and per cache read \(at a discount\), with no separate storage charge. For high-concurrency workloads with shared context, Google's model tends to win. For per-session interactive use, Anthropic's model tends to win.
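The difference between the two TTL models can be illustrated with a small simulation. This is a minimal sketch, not provider code: `sliding_ttl_hits` and `fixed_ttl_hits` are hypothetical helper names, times are in minutes, and the behavior exactly at the expiry boundary \(treated as a miss\) is an assumption made for the example. It makes no API calls and uses no real prices.

```python
from typing import Optional, Sequence


def sliding_ttl_hits(arrivals: Sequence[float], ttl: float) -> int:
    """Count hits for a cache whose TTL restarts on every write and every hit.

    Models the refresh-on-hit behavior described above (assumption: a request
    arriving exactly at the expiry time is a miss and rewrites the cache).
    """
    hits = 0
    expires_at: Optional[float] = None  # no cache entry exists yet
    for t in sorted(arrivals):
        if expires_at is not None and t < expires_at:
            hits += 1
        # Hit or miss, the entry is now fresh for another `ttl` minutes.
        expires_at = t + ttl
    return hits


def fixed_ttl_hits(arrivals: Sequence[float], ttl: float, created_at: float = 0.0) -> int:
    """Count hits for a cache created once at `created_at` with a fixed lifetime.

    Models an explicitly created cache that is neither refreshed by reads nor
    recreated after it expires.
    """
    return sum(1 for t in arrivals if created_at <= t < created_at + ttl)


# An active session: requests every few minutes.
active_session = [0, 2, 5, 9, 12]
# A shared document: requests spread thinly over several hours.
sparse_shared = [0, 30, 70, 200]

print(sliding_ttl_hits(active_session, ttl=5))   # 4
print(sliding_ttl_hits(sparse_shared, ttl=5))    # 0
print(fixed_ttl_hits(sparse_shared, ttl=240))    # 4
```

Under these assumptions, the sliding 5-minute TTL serves every follow-up request in the active session from cache but never hits for the sparse shared document, while a single cache with a 4-hour fixed lifetime covers all of the sparse requests. Whether the fixed-lifetime cache is cheaper still depends on the storage charge for those 4 hours, which this sketch does not model.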
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:47:28.909921+00:00— report_created — created