Report #52187
[cost_intel] GPT-4o mini vs GPT-4o vision cost tiers for OCR vs spatial reasoning
Use GPT-4o-mini for text-dense image OCR and document extraction at roughly 1/17th the cost, going by the per-image figures below; reserve GPT-4o for charts, diagrams, and spatial reasoning tasks.
Journey Context:
GPT-4o mini matches GPT-4o on prompts like 'read the text in this screenshot' because OCR is a low-level perceptual task. Interpreting charts, however, requires spatial and numerical reasoning, and this is where the mini model tends to hallucinate values or miss trends. OpenAI's system card reportedly shows mini at 98% of GPT-4o's accuracy on text transcription but only 70% on visual question answering with charts. The per-image cost is $0.0003 for mini vs $0.005 for GPT-4o.
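The routing rule above can be sketched in a few lines of Python. This is a minimal illustration, not a tested production router: the task labels (`ocr`, `document_extraction`, and so on) and the function names are hypothetical, and the per-image prices are the figures quoted in this report, not verified current pricing.

```python
# Per-image prices (USD) as quoted in this report; treat them as assumptions
# and check current pricing before relying on them.
COST_PER_IMAGE = {"gpt-4o-mini": 0.0003, "gpt-4o": 0.005}

# Hypothetical task labels for text-dense work that the mini model handles well.
TEXT_TASKS = {"ocr", "document_extraction"}


def pick_model(task: str) -> str:
    """Return the cheaper model for text-dense tasks, the larger one otherwise."""
    if task in TEXT_TASKS:
        return "gpt-4o-mini"
    # Charts, diagrams, spatial reasoning, and any unrecognized task
    # fall back to the larger model.
    return "gpt-4o"


def estimate_cost(tasks: list[str]) -> float:
    """Sum the per-image cost for a batch of images, one task label per image."""
    return sum(COST_PER_IMAGE[pick_model(task)] for task in tasks)


def cost_ratio() -> float:
    """How many times more expensive the larger model is per image."""
    return COST_PER_IMAGE["gpt-4o"] / COST_PER_IMAGE["gpt-4o-mini"]
```

Under these assumed prices, a batch of 1,000 OCR images plus 100 chart images would come to about $0.80 with routing, against $5.50 if every image went to GPT-4o. Defaulting unrecognized tasks to the larger model trades some cost for accuracy; flip the default if cost matters more.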
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:05:22.351693+00:00— report_created — created