Report #80161
[cost_intel] Defaulting to reasoning models for all structured data extraction from documents
Use GPT-4o-mini or Claude 3 Haiku for schema-following extraction from clean PDFs ($0.0001/page). Reserve o3-mini for 'adversarial' layouts (nested tables, handwritten annotations, cross-page references) where cheap models show >15% field hallucination.
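The routing rule above can be sketched as a small function. This is a minimal illustration, not a tested pipeline: `LayoutSignals`, `choose_model`, and the field names are hypothetical, and how the layout flags or the hallucination rate get measured is left open. Escalating when no rate has been measured is an assumption made here, not something the report states.

```python
from dataclasses import dataclass
from typing import Optional

CHEAP_MODEL = "gpt-4o-mini"        # the report also names Claude 3 Haiku
REASONING_MODEL = "o3-mini"
HALLUCINATION_THRESHOLD = 0.15     # report: escalate above 15% field hallucination


@dataclass
class LayoutSignals:
    """Hypothetical per-document flags; detection is out of scope here."""
    nested_tables: bool = False
    handwritten_annotations: bool = False
    cross_page_references: bool = False
    # Measured field hallucination rate of the cheap model on similar
    # documents (0.0 to 1.0), or None if it has not been measured.
    cheap_model_hallucination_rate: Optional[float] = None


def choose_model(signals: LayoutSignals) -> str:
    """Return the model name to use for one document."""
    adversarial = (
        signals.nested_tables
        or signals.handwritten_annotations
        or signals.cross_page_references
    )
    if not adversarial:
        # Clean layout: the cheap model is the default.
        return CHEAP_MODEL
    rate = signals.cheap_model_hallucination_rate
    # Assumption: with no measurement on an adversarial layout, escalate.
    if rate is None or rate > HALLUCINATION_THRESHOLD:
        return REASONING_MODEL
    return CHEAP_MODEL
```

For example, `choose_model(LayoutSignals())` returns the cheap model, while a document flagged with `nested_tables=True` and a measured rate of 0.30 is routed to the reasoning model.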
Journey Context:
On standard invoices with clean OCR, GPT-4o-mini achieves 99% F1 on key fields; o3-mini adds only marginal accuracy at 50x the cost. On scientific papers with multi-column tables that span pages, however, cheap models hallucinate 30% of citations, and o3-mini's spatial reasoning cuts this to 5%. The signature of a document that needs the reasoning model is 'requires visual grounding across non-sequential regions' or 'handwritten annotations overlaying printed text.' Many RAG pipelines overpay by running vision-language reasoning models on clean HTML/PDF text extraction, where structured parsing + a cheap LLM suffices.
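To make the overpayment concrete, the arithmetic can be sketched as follows. The $0.0001/page price and the 50x multiplier come from this report; applying the multiplier directly to the per-page price is an assumption, and the function names and the adversarial fraction used in the example are hypothetical.

```python
CHEAP_COST_PER_PAGE = 0.0001       # USD, from the report
REASONING_COST_MULTIPLIER = 50     # report: o3-mini "costs 50x"


def blended_cost_per_page(adversarial_fraction: float) -> float:
    """Average USD cost per page when only adversarial pages are escalated."""
    if not 0.0 <= adversarial_fraction <= 1.0:
        raise ValueError("adversarial_fraction must be between 0 and 1")
    # Assumption: the 50x multiplier applies to the per-page price.
    reasoning_cost = CHEAP_COST_PER_PAGE * REASONING_COST_MULTIPLIER
    return (
        (1.0 - adversarial_fraction) * CHEAP_COST_PER_PAGE
        + adversarial_fraction * reasoning_cost
    )


def savings_vs_always_reasoning(adversarial_fraction: float) -> float:
    """Fraction of spend saved compared with sending every page to o3-mini."""
    always_reasoning = CHEAP_COST_PER_PAGE * REASONING_COST_MULTIPLIER
    return 1.0 - blended_cost_per_page(adversarial_fraction) / always_reasoning
```

Under these assumptions, a corpus where 10% of pages are adversarial would cost roughly $0.00059 per page with routing, against $0.005 per page when everything goes to the reasoning model.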
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:09:35.047767+00:00— report_created — created