Report #92940
[cost_intel] Using Claude 3.5 Sonnet or GPT-4o for simple document OCR text extraction instead of smaller models
Use Claude 3.5 Haiku or Gemini 1.5 Flash for document OCR and text extraction; reserve Sonnet/Pro for visual reasoning (charts, spatial logic).
Journey Context:
OCR on standard documents is now fundamentally a pattern-matching task. Haiku and Flash extract text from standard PDFs and images with greater than 99% accuracy at 1/20th the cost of the frontier models. Frontier models shine when asked a reasoning question such as "what is the trend in this bar chart?", not when simply reading the text. Degradation signature when a small model is given a complex vision task: it describes the chart elements instead of answering the mathematical question.
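A minimal sketch of the routing idea above, assuming requests can be separated by a simple keyword check on the prompt. The tier labels, the keyword list, and the function names `pick_tier` and `blended_cost` are hypothetical and not part of any provider SDK; the 1/20 cost ratio is the figure quoted in this report and has not been verified here.

```python
# Hypothetical tier labels; map them to whatever model IDs your provider exposes.
SMALL_TIER = "small"        # e.g. Claude 3.5 Haiku or Gemini 1.5 Flash
FRONTIER_TIER = "frontier"  # e.g. Claude 3.5 Sonnet or GPT-4o

# Assumed hints that a prompt needs visual reasoning rather than plain extraction.
REASONING_HINTS = ("trend", "chart", "graph", "compare", "how many")

# Cost of the small tier relative to the frontier tier, as claimed in this report.
SMALL_COST_RATIO = 1 / 20


def pick_tier(prompt: str) -> str:
    """Return the frontier tier only when the prompt hints at visual reasoning."""
    text = prompt.lower()
    if any(hint in text for hint in REASONING_HINTS):
        return FRONTIER_TIER
    return SMALL_TIER


def blended_cost(share_small: float) -> float:
    """Cost relative to sending everything to the frontier tier (1.0 = no savings)."""
    if not 0.0 <= share_small <= 1.0:
        raise ValueError("share_small must be between 0 and 1")
    return share_small * SMALL_COST_RATIO + (1.0 - share_small)


if __name__ == "__main__":
    print(pick_tier("Extract all text from this scanned invoice"))
    print(pick_tier("What is the trend in this bar chart?"))
    print(blended_cost(0.9))
```

A keyword check is deliberately crude and will misroute some prompts. If the small tier returns a description of chart elements where a numeric answer was expected, that matches the degradation signature above and is a reasonable trigger to retry on the frontier tier.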
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T14:35:15.514848+00:00— report_created — created