Report #66615
[cost_intel] Where does Gemini 1.5 Flash fail on fine-grained OCR vs Pro?
Avoid Flash for text <10pt, low-contrast scans, or tables with merged cells; use it for high-resolution images with text >12pt. Flash is 5x cheaper, but it shows a 25% character error rate on fine print vs <1% for Pro.
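The rule above can be sketched as a small routing function. This is only an illustration: `PageProfile`, its fields, and `choose_model` are hypothetical names, and the report does not say how font size or contrast would be measured, so both are assumed to come from some upstream check. Anything outside the regime the report calls safe is sent to Pro.

```python
from dataclasses import dataclass

FLASH = "gemini-1.5-flash"
PRO = "gemini-1.5-pro"


@dataclass
class PageProfile:
    """Hypothetical per-page summary produced by an upstream pre-check."""
    min_font_pt: float      # smallest font size on the page, in points
    high_contrast: bool     # assumed result of a separate contrast check
    has_merged_cells: bool  # True if any table contains merged cells


def choose_model(page: PageProfile) -> str:
    """Pick Flash only when text is >12pt, contrast is high, and no merged cells."""
    if page.min_font_pt > 12 and page.high_contrast and not page.has_merged_cells:
        return FLASH
    # Conservative default: small print, low contrast, or merged cells go to Pro.
    return PRO
```

For example, `choose_model(PageProfile(9, True, False))` returns the Pro model name, while a 14pt, high-contrast page with no merged cells is routed to Flash.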
Journey Context:
Gemini 1.5 Flash is 5x cheaper than Pro ($0.35/1M vs $1.75/1M tokens for 128k context) and matches Pro on general image description and large-text OCR (>12pt). However, on fine-grained OCR tasks (specifically 8-10pt font in scanned PDFs, low-contrast grayscale text, and complex tables with merged cells spanning multiple rows), Flash exhibits a 25% character error rate vs Pro's <1%. The degradation signature is confident misreading of look-alike characters (0 vs O, 1 vs l) and loss of table structure. For invoice processing or medical records with small print, the cost of correcting these errors erases the 5x savings. Use Flash only when text is >12pt and contrast is high.
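To check these figures on your own documents, character error rate can be computed as the edit distance between a hand-verified transcription and the model output, divided by the length of the reference. The sketch below assumes that common definition; the report does not state how its 25% and <1% figures were measured. The function names and the sample strings are illustrative only.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """Edit distance normalised by the reference length."""
    if not ref:
        raise ValueError("reference text must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Illustrative strings showing the 0/O and 1/l confusions described above.
reference = "INV-1001"
ocr_output = "INV-lOO1"
print(character_error_rate(reference, ocr_output))  # 0.375 (3 substitutions / 8 chars)
```

Running this over a small labelled sample from each document class (fine print, low contrast, merged-cell tables) gives a per-class error rate to compare against the price difference.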
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T18:17:38.170388+00:00— report_created — created