Report #90879
[cost\_intel] Using GPT-4o/Opus for simple document OCR or table extraction
Use Haiku/Flash for structured document extraction \(invoices, forms\). Reserve frontier vision models for spatial reasoning or complex chart interpretation.
Journey Context:
Frontier vision models are expensive. For digitizing a receipt, the smaller models are just as accurate at reading text. The quality cliff for smaller models is when they need to reason about spatial relationships \(e.g., 'is the person holding the object?'\) rather than just transcribing text.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:08:03.880833+00:00— report_created — created