Report #61092
[cost_intel] Using GPT-4o for all document OCR, including clean screenshots
Use GPT-4o-mini for clean screenshots, digital PDFs, and single-column text. Escalate to GPT-4o only for handwriting, complex tables, multi-column layouts, or fonts smaller than 10pt. The cost difference is 15-20x per image.
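A minimal sketch of the routing rule above, assuming the document traits are already known (for example from upload metadata or a cheap pre-check). `DocTraits` and `pick_ocr_model` are hypothetical names for illustration; only the two model names and the escalation triggers come from this report.

```python
from dataclasses import dataclass

# Model names come from the report; everything else here is hypothetical.
CHEAP_MODEL = "gpt-4o-mini"
STRONG_MODEL = "gpt-4o"


@dataclass
class DocTraits:
    """Hypothetical traits, assumed to come from metadata or a cheap pre-check."""
    has_handwriting: bool = False
    has_complex_tables: bool = False
    column_count: int = 1
    min_font_pt: float = 12.0


def pick_ocr_model(traits: DocTraits) -> str:
    """Return the stronger model only when an escalation trigger applies."""
    needs_escalation = (
        traits.has_handwriting
        or traits.has_complex_tables
        or traits.column_count > 1      # multi-column layout
        or traits.min_font_pt < 10.0    # fonts smaller than 10pt
    )
    return STRONG_MODEL if needs_escalation else CHEAP_MODEL
```

For example, `pick_ocr_model(DocTraits())` picks the cheap model for a clean single-column page, while `pick_ocr_model(DocTraits(has_handwriting=True))` escalates. How the traits are detected is left open here; the report does not describe a detection step.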
Journey Context:
GPT-4o-mini vision costs $0.15/1M pixels vs $10/1M for GPT-4o. Mini fails on spatial reasoning (such as understanding table cell relationships) and on handwriting, so GPT-4o is needed for complex layouts. For receipt OCR (clean text), mini is 99% accurate at 1/20th the cost.
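To gauge the savings from routing, the blended cost can be estimated from the share of images that still need GPT-4o. This is a rough sketch that takes the report's unverified 1/20 figure as a default; `blended_cost_ratio` and its parameters are hypothetical names, and per-image size differences are ignored.

```python
def blended_cost_ratio(escalated_fraction: float,
                       cheap_cost_ratio: float = 1 / 20) -> float:
    """Cost of routed OCR relative to sending every image to GPT-4o.

    escalated_fraction: share of images still routed to GPT-4o (0 to 1).
    cheap_cost_ratio: mini cost as a fraction of GPT-4o cost per image;
        the 1/20 default is the report's figure, not a verified price.
    """
    if not 0.0 <= escalated_fraction <= 1.0:
        raise ValueError("escalated_fraction must be between 0 and 1")
    # Escalated images cost full price; the rest cost the cheap fraction.
    return escalated_fraction + (1.0 - escalated_fraction) * cheap_cost_ratio
```

Under these assumptions, escalating 20% of images gives `blended_cost_ratio(0.2)` of about 0.24, that is, roughly a quarter of the all-GPT-4o bill.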
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:01:46.737585+00:00— report_created — created