Report #73864
[cost\_intel] Gemini context caching 1-hour minimum storage charge
Only use Gemini context caching for: \(1\) >32k token contexts, \(2\) query frequency >6/hour, \(3\) cache lifetime >1 hour; for ad-hoc or infrequent queries, use standard generateContent with systemInstruction and contents fields to avoid the 1-hour minimum fee
Journey Context:
Google's context caching pricing shows $0.00025/1K tokens/hour, appearing cheaper than standard input. However, they enforce a 1-hour minimum storage charge per cache creation. A single 100k token query cached for 5 minutes costs $0.025 \(100 \* 0.00025 \* 1 hour minimum\) versus $0.0005 for standard input \(100 \* 0.000005\). You need 50\+ queries against identical cached content within one hour to break even. Most RAG implementations with unique documents per user never achieve this frequency, making caching a cost trap.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:34:36.081556+00:00— report_created — created