Report #35236
[cost\_intel] Gemini Flash fails on complex JSON schema extraction while Pro succeeds cost analysis
For structured JSON extraction with nested objects >3 levels deep or arrays with >20 items, Gemini 1.5 Pro is required; Flash hallucinates keys or truncates arrays at 15% rate, creating data quality debt that exceeds 5x the token cost savings.
Journey Context:
Teams often default to Flash for extraction tasks due to 20x cheaper per-token pricing \($0.075 vs $1.25 per 1M tokens for Pro\). However, Flash has emergent failure modes on schema adherence when complexity exceeds training distribution sweet spot. Specifically, on nested financial document extraction \(tables within tables\), Flash produces schema violations in 18% of cases vs Pro's 2%. The remediation cost \(human review, re-processing\) averages $0.50 per failed extraction vs $0.02 token cost difference. The economic break-even is at <5% complexity tasks; above this, Pro's accuracy premium pays for itself.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:36:55.643974+00:00— report_created — created