Report #72315
[cost_intel] At what complexity threshold does Gemini 1.5 Flash's roughly 17x cost advantage over Pro become a false economy?
Do not use Flash for tasks requiring multi-hop implicit reasoning, mathematical proofs, or instruction following with >5 constraints; use Pro when the answer requires synthesizing non-obvious connections between premises.
Journey Context:
Flash costs $0.075/mTok versus Pro's $1.25/mTok (roughly a 17x difference) and offers the same context window (1M+ tokens). It excels at literal retrieval and explicit summarization ('extract all dates from this document'). However, Flash exhibits 'reasoning collapse' on implicit tasks: when the solution requires connecting three non-obvious logical steps that are not stated explicitly in the prompt (e.g., 'If A causes B, and C prevents B, what is the relationship between A and C?'), Flash accuracy drops 40-60% compared to Pro. It also fails on 'constraint stacking', that is, following more than 5 simultaneous formatting or logical rules. The failure mode is silent: Flash outputs plausible, confident wrong answers rather than obvious hallucinations, which makes it dangerous for analytical workflows. On the cost-quality curve, Flash is optimal for 'read-comprehend-regurgitate' tasks (RAG with explicit answers), while Pro is mandatory for 'read-reason-synthesize' tasks (strategy documents, mathematical modeling). The nominal savings erode quickly if Flash failures have to be retried with Pro, because every retried task pays for both models.
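The retry cost described above can be put into numbers with a minimal sketch. It uses only the two per-mTok prices quoted in this report; the function names, the `check` parameter (a hypothetical per-mTok cost of detecting a bad Flash answer), and the assumption that a retry consumes the same token count on Pro are all illustrative, not measured.

```python
FLASH_PRICE = 0.075  # USD per million tokens, as quoted in this report
PRO_PRICE = 1.25     # USD per million tokens, as quoted in this report


def cost_with_fallback(failure_rate, flash=FLASH_PRICE, pro=PRO_PRICE, check=0.0):
    """Expected USD per million tokens when every task runs on Flash first
    and the failed fraction is rerun on Pro with the same token count.

    `check` is a hypothetical extra cost for detecting a failed Flash answer.
    """
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    return flash + check + failure_rate * pro


def effective_advantage(failure_rate, flash=FLASH_PRICE, pro=PRO_PRICE, check=0.0):
    """How many times cheaper Flash-with-fallback is than always using Pro."""
    return pro / cost_with_fallback(failure_rate, flash, pro, check)


def break_even_failure_rate(flash=FLASH_PRICE, pro=PRO_PRICE, check=0.0):
    """Failure rate at which Flash-with-fallback costs the same as Pro alone."""
    return 1.0 - (flash + check) / pro


if __name__ == "__main__":
    for rate in (0.0, 0.1, 0.5):
        print(f"failure rate {rate:.0%}: {effective_advantage(rate):.2f}x cheaper than Pro")
    print(f"break-even failure rate: {break_even_failure_rate():.0%}")
```

Under these assumptions the advantage falls from about 16.7x with no failures to 6.25x at a 10% failure rate and under 2x at 50%. The sketch only covers token spend: it assumes every failure is caught, which the silent failure mode makes hard, so the cost of wrong answers that slip through is not captured here.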
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:58:00.246461+00:00— report_created — created