Report #54059
[cost_intel] OpenAI o1-preview costs $15/1M input and $60/1M output tokens vs $5/$15 for GPT-4o, but only reduces error rate by 50% on standard business logic
Use o1-preview only for problems requiring more than 5 sequential reasoning steps or formal logic; use GPT-4o with CoT prompting for problems of 5 steps or fewer
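The routing rule above can be sketched as a small function. This is a minimal illustration, not a tested policy: the `Task` type, its `estimated_steps` and `needs_formal_logic` fields, and the way a caller would estimate step count are all hypothetical and not part of the report.

```python
from dataclasses import dataclass

# Threshold from the report's rule of thumb; tune it against your own evals.
STEP_THRESHOLD = 5


@dataclass
class Task:
    # Hypothetical fields: how they get filled in is up to the caller.
    estimated_steps: int      # estimated number of sequential reasoning steps
    needs_formal_logic: bool  # True for proofs, constraint solving, and similar


def choose_model(task: Task) -> str:
    """Send only long-chain or formal-logic tasks to the expensive model."""
    if task.needs_formal_logic or task.estimated_steps > STEP_THRESHOLD:
        return "o1-preview"
    # Everything else goes to GPT-4o, to be used with CoT prompting.
    return "gpt-4o"


print(choose_model(Task(estimated_steps=3, needs_formal_logic=False)))
print(choose_model(Task(estimated_steps=8, needs_formal_logic=False)))
```

The step estimate is the weak point of any router like this, so it is worth logging routed tasks and spot-checking them against both models.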
Journey Context:
o1-preview generates hidden reasoning (chain-of-thought) tokens that are billed as output tokens, which makes it 10-20x more expensive per request than GPT-4o. On the GPQA benchmark it scores 75% vs GPT-4o's 40%, but on typical business data extraction the accuracy gap is only 10-20% while the cost is about 15x. Common mistake: routing all 'hard' queries to o1-preview without first checking whether 4-shot CoT on GPT-4o reaches 95% of the accuracy at 1/15th of the cost. Quality degradation signature: GPT-4o 'hallucinates' intermediate steps in math, whereas o1-preview shows a correct stepwise derivation.
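To make the cost reasoning concrete, here is a rough sketch of how billed reasoning tokens inflate the per-request cost, plus the "95% of the accuracy" check. The per-million prices are assumed list prices and should be checked against the current pricing page; the token counts (including the 3000 reasoning tokens) and the helper names `Pricing`, `request_cost`, and `cheap_model_is_enough` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Assumed list prices; verify before relying on them.
GPT_4O = Pricing(input_per_m=5.0, output_per_m=15.0)
O1_PREVIEW = Pricing(input_per_m=15.0, output_per_m=60.0)


def request_cost(pricing: Pricing, input_tokens: int,
                 visible_output_tokens: int, reasoning_tokens: int = 0) -> float:
    """Cost in USD; hidden reasoning tokens are billed at the output rate."""
    billed_output = visible_output_tokens + reasoning_tokens
    return (input_tokens * pricing.input_per_m
            + billed_output * pricing.output_per_m) / 1_000_000


def cheap_model_is_enough(cheap_accuracy: float, strong_accuracy: float,
                          min_ratio: float = 0.95) -> bool:
    """True if the cheap model reaches at least min_ratio of the strong model's accuracy."""
    return cheap_accuracy >= min_ratio * strong_accuracy


# Hypothetical request: 2000 input tokens, 500 visible output tokens,
# and 3000 hidden reasoning tokens on o1-preview.
cheap = request_cost(GPT_4O, 2000, 500)
strong = request_cost(O1_PREVIEW, 2000, 500, reasoning_tokens=3000)
print(f"GPT-4o: ${cheap:.4f}  o1-preview: ${strong:.4f}  ratio: {strong / cheap:.1f}x")
```

With these assumed numbers the ratio comes out at about 13.7x, inside the 10-20x range quoted above; the real multiplier depends heavily on how many reasoning tokens a given prompt triggers. Running `cheap_model_is_enough` on accuracies from your own eval set is one way to decide whether escalation is worth paying for.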
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T21:13:58.095483+00:00— report_created — created