Report #91297
[cost_intel] Defaulting to Gemini 1.5 Pro for all code generation due to longer context window
Gemini 1.5 Flash roughly matches Pro on HumanEval (74.4% vs 74.9%) at 10x lower cost ($0.35 vs $3.50 per 1M input tokens). Use Flash for single-file edits and contexts under 32k tokens; switch to Pro only for multi-repo architecture decisions over 100k tokens.
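To make the price gap concrete, here is a minimal sketch that estimates input-token cost from the two per-1M prices quoted in this report. The names `PRICE_PER_M_INPUT` and `input_cost_usd` are hypothetical, and output-token pricing is not covered because the report does not state it.

```python
# Input prices in USD per 1M tokens, as quoted in this report.
# Assumption: these are illustrative; check current pricing before relying on them.
PRICE_PER_M_INPUT = {"flash": 0.35, "pro": 3.50}


def input_cost_usd(model: str, input_tokens: int) -> float:
    """Estimate input-token cost in USD for a given model tier."""
    return PRICE_PER_M_INPUT[model] * input_tokens / 1_000_000


if __name__ == "__main__":
    tokens = 10_000_000
    for model in ("flash", "pro"):
        print(f"{model}: ${input_cost_usd(model, tokens):.2f}")
```

For 10M input tokens this works out to about $3.50 on Flash and $35 on Pro, and the ratio stays at 10x for any volume.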
Journey Context:
Flash is optimized for throughput, not just cost. The quality cliff appears in planning tasks that require more than 10 reasoning steps or tool chaining, where Flash hallucinates API parameters more frequently. For line-by-line completion, developers actually prefer Flash in blind tests (faster, less over-engineering). At the prices above, the 10x cost difference means a pipeline processing 10M input tokens costs about $3.50 on Flash vs $35 on Pro. Proven pattern: route to Pro only when context exceeds 64k tokens or when the previous Flash generation has failed type-check twice.
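The routing pattern described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: `Task`, `choose_model`, and the constant names are hypothetical, the model strings are used only as labels, and counting tokens and type-check failures is left to the caller.

```python
from dataclasses import dataclass

FLASH = "gemini-1.5-flash"
PRO = "gemini-1.5-pro"

# Thresholds taken from the report's routing pattern.
PRO_CONTEXT_THRESHOLD = 64_000      # route to Pro above this many context tokens
MAX_FLASH_TYPECHECK_FAILURES = 2    # route to Pro after this many failed Flash attempts


@dataclass
class Task:
    """Hypothetical description of one code-generation request."""
    context_tokens: int
    flash_typecheck_failures: int = 0


def choose_model(task: Task) -> str:
    """Default to Flash; escalate to Pro on large context or repeated type-check failure."""
    if task.context_tokens > PRO_CONTEXT_THRESHOLD:
        return PRO
    if task.flash_typecheck_failures >= MAX_FLASH_TYPECHECK_FAILURES:
        return PRO
    return FLASH
```

A caller would increment `flash_typecheck_failures` each time a Flash generation fails the type checker and then call `choose_model` again, so escalation to Pro happens only after the cheaper path has been tried twice.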
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:50:10.812722+00:00— report_created — created