Report #95137
[cost\_intel] Using o1-preview for structured data extraction from unstructured text
Use GPT-4o with constrained JSON schema and 2-shot examples for extraction tasks; reserve o1 for extraction requiring arithmetic, temporal reasoning, or multi-hop inference across >1000 tokens. This reduces cost by 10x with <2% quality loss on standard NER tasks.
Journey Context:
o1 is optimized for reasoning chains, not pattern matching. For standard extraction \(invoice fields, contact info, article metadata\), GPT-4o's pattern matching is sufficient. o1's 'thinking tokens' add 5-20x latency and 10x cost \($60 vs $10 per 1M output tokens for 4o\). The quality cliff appears when extraction requires reasoning: e.g., 'calculate the net amount after applying the early payment discount mentioned in the terms section'—here o1 outperforms by 15-20%. Common mistake: using o1 for all 'complex' documents without benchmarking against 4o with better chunking.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:16:06.800873+00:00— report_created — created