Report #68925
[cost_intel] Google Gemini 1.5 Pro pricing doubles at 128k context window threshold
Hard-cap contexts at 127k tokens via sliding window truncation; use Gemini 1.5 Flash for contexts 128k-1M tokens.
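One way to sketch the 127k cap is a sliding window that keeps the newest chunks fitting the budget. The names `sliding_window` and `count_tokens` are hypothetical: `count_tokens` stands in for whatever tokenizer or token-counting call you use, and the 127,000 default assumes "127k" means 127,000 tokens.

```python
from typing import Callable, List

# Cap from this report; assumes "127k" means 127,000 tokens.
MAX_CONTEXT_TOKENS = 127_000


def sliding_window(
    chunks: List[str],
    count_tokens: Callable[[str], int],
    budget: int = MAX_CONTEXT_TOKENS,
) -> List[str]:
    """Keep the most recent chunks whose combined token count fits the budget.

    `count_tokens` is a caller-supplied (hypothetical) function; swap in the
    tokenizer that matches the model you are billing against.
    """
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the oldest context is dropped first.
    for chunk in reversed(chunks):
        n = count_tokens(chunk)
        if used + n > budget:
            break
        kept.append(chunk)
        used += n
    kept.reverse()  # restore chronological order
    return kept
```

A whitespace counter such as `lambda s: len(s.split())` is enough to try the helper, but it will not match the model's real token counts, so leave headroom or use the provider's own counter before relying on the cap.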
Journey Context:
Google Gemini 1.5 Pro pricing has a discrete cliff: $3.50 per 1M input tokens for inputs ≤128k, but $7.00 per 1M input tokens for inputs >128k. As quoted, the higher rate applies to the whole input, not only to the tokens past the threshold, so the per-token rate rises 100% at the 128,001st token. Developers using 1.5 Pro for 'infinite context' RAG or large codebases can inadvertently cross this threshold and double their costs. The trap is linear thinking: by token count alone, 129k tokens should cost about 1.6% more than 127k, but at these rates it costs about 103% more ($0.903 vs. $0.4445). The fix is strict truncation at 127k tokens using a rolling window, or switching to Gemini 1.5 Flash, whose listed rates ($0.35/$0.70 per 1M tokens) stay below Pro's lower tier for contexts up to 1M tokens.
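The cliff is easy to check with a few lines of arithmetic. This is a minimal sketch, not a billing tool: it uses the rates quoted in this report, assumes "128k" means 128,000 tokens, and assumes the higher rate applies to the entire input once the threshold is crossed. The function names are illustrative only.

```python
# Rates quoted in this report (USD per 1M input tokens); verify against
# current pricing before relying on them.
PRO_RATE_LOW = 3.50        # inputs <= 128k
PRO_RATE_HIGH = 7.00       # inputs > 128k
TIER_THRESHOLD = 128_000   # assumption: "128k" means 128,000 tokens


def pro_input_cost(tokens: int) -> float:
    """Input cost in USD, assuming the higher rate covers the whole input."""
    rate = PRO_RATE_LOW if tokens <= TIER_THRESHOLD else PRO_RATE_HIGH
    return tokens * rate / 1_000_000


def linear_input_cost(tokens: int) -> float:
    """Cost if the lower rate applied at every size (the 'linear' intuition)."""
    return tokens * PRO_RATE_LOW / 1_000_000


if __name__ == "__main__":
    for n in (127_000, 128_000, 129_000):
        print(n, round(pro_input_cost(n), 4), round(linear_input_cost(n), 4))
```

Comparing `pro_input_cost(129_000)` with `pro_input_cost(127_000)` shows the jump, while `linear_input_cost` gives the figure a developer would expect if pricing scaled smoothly with token count.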
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:10:23.616036+00:00— report_created — created