Report #65715
[cost_intel] When does long-context GPT-4o outperform o1 on document analysis despite the reasoning premium?
Use GPT-4o with its 128k context window for needle-in-a-haystack retrieval and summarization of documents longer than 50 pages; reserve o1 for questions that require cross-chapter causal reasoning (e.g., 'Why did character X's decision in chapter 1 cause event Y in chapter 20?').
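A minimal sketch of that routing rule, assuming the caller already knows which kind of question is being asked. The `Task` enum and `pick_model` function are hypothetical names for illustration only; the returned strings are labels, not verified API model identifiers.

```python
from enum import Enum


class Task(Enum):
    # Hypothetical task categories mirroring the recommendation above.
    RETRIEVAL = "retrieval"                        # needle-in-a-haystack lookup
    SUMMARIZATION = "summarization"                # summarize a long document
    CROSS_CHAPTER_CAUSAL = "cross_chapter_causal"  # link distant parts causally


def pick_model(task: Task) -> str:
    """Return the model label the report recommends for a task type."""
    if task is Task.CROSS_CHAPTER_CAUSAL:
        return "o1"
    # Retrieval and summarization stay on the cheaper long-context model.
    return "gpt-4o"
```

Defaulting to GPT-4o keeps the reasoning premium opt-in: only the one task category the report singles out is routed to o1.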
Journey Context:
o1 costs more per token, and its effective context is shorter in practice (roughly 64k for o1-preview), since hidden reasoning tokens also occupy the context window. For tasks where the 'reasoning' amounts to 'find the relevant quote and summarize it', GPT-4o's 128k context and lower cost make it the better choice. o1's advantage appears only when the answer requires integrating evidence from more than 3 separate locations in the text through non-obvious logical connections. The practical signal: if GPT-4o's answers hold up while citing a single paragraph, it is sufficient; if it misses multi-hop connections, upgrade to o1.
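The escalation signal can be sketched as a simple check, assuming a reviewer (or a small labelled evaluation set) has marked which paragraphs a correct answer must draw on. The function name, the paragraph-ID representation, and the default threshold of 3 are illustrative assumptions based on the rule of thumb above, not a measured cutoff.

```python
from typing import Set


def should_escalate_to_o1(cited: Set[int], required: Set[int],
                          hop_threshold: int = 3) -> bool:
    """Decide whether a GPT-4o answer suggests the question needs o1.

    cited:    paragraph IDs the GPT-4o answer actually cited
    required: paragraph IDs a correct answer must combine (assumed to be
              known from manual review; this is a hypothetical input)
    """
    missed = required - cited
    # Escalate only when the question spans more than `hop_threshold`
    # locations AND the cheaper model failed to connect some of them.
    return len(required) > hop_threshold and len(missed) > 0
```

For example, `should_escalate_to_o1({4}, {4})` is `False` (a single-paragraph answer that covers everything needed), while `should_escalate_to_o1({4}, {1, 4, 9, 20})` is `True` (four required locations, three of them missed).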
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:47:14.661869+00:00— report_created — created