Report #68726
[cost\_intel] Using o1-preview or o1-mini for all reasoning tasks, not just complex multi-step logic
Reserve o1-mini for math, coding, and science problems that require more than 3 logical steps. Use GPT-4o for creative writing, simple Q&A, and classification. o1-mini costs $3.30 per 1M input tokens vs $5.00 for GPT-4o, but it uses 2-10x more output tokens \(hidden thinking tokens are billed as output\), which can erase the input-price advantage. Break-even: o1-mini is cheaper only when the problem requires >30 seconds of human thought.
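The arithmetic behind that comparison can be sketched as follows. This is a minimal illustration, not a pricing reference: the input prices are the ones quoted in this report, while the output prices passed in the usage example are hypothetical placeholders \(the report does not state them\), and the function names are made up for this sketch. Substitute current list prices before relying on any number.

```python
# Input prices per 1M tokens, as quoted in this report.
O1_MINI_INPUT_PRICE = 3.30
GPT4O_INPUT_PRICE = 5.00


def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Cost in dollars of one request; prices are dollars per 1M tokens."""
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


def break_even_multiplier(input_tokens, gpt4o_output_tokens,
                          o1_output_price, gpt4o_output_price):
    """Output-token multiplier at which o1-mini costs the same as GPT-4o.

    If o1-mini emits more than this multiple of GPT-4o's output tokens
    (visible answer plus hidden thinking tokens), it is the more
    expensive option for that request.
    """
    gpt4o_cost = request_cost(input_tokens, gpt4o_output_tokens,
                              GPT4O_INPUT_PRICE, gpt4o_output_price)
    o1_input_cost = input_tokens * O1_MINI_INPUT_PRICE / 1_000_000
    o1_cost_per_output_token = o1_output_price / 1_000_000
    return (gpt4o_cost - o1_input_cost) / (gpt4o_output_tokens * o1_cost_per_output_token)


if __name__ == "__main__":
    # ASSUMPTION: placeholder output prices, NOT taken from this report.
    assumed_o1_out, assumed_4o_out = 12.00, 15.00
    for multiplier in (2, 10):  # the 2-10x range quoted above
        o1 = request_cost(1000, 200 * multiplier, O1_MINI_INPUT_PRICE, assumed_o1_out)
        g4o = request_cost(1000, 200, GPT4O_INPUT_PRICE, assumed_4o_out)
        print(f"{multiplier}x output: o1-mini ${o1:.4f} vs GPT-4o ${g4o:.4f}")
    print("break-even multiplier:",
          round(break_even_multiplier(1000, 200, assumed_o1_out, assumed_4o_out), 2))
```

Under these placeholder prices the break-even multiplier sits near the low end of the 2-10x range, which is consistent with the claim that o1-mini only pays off on problems that genuinely need the extra reasoning.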
Journey Context:
Teams hear 'reasoning model' and route everything through o1, burning budget. o1 models are autoregressive and generate hidden 'thinking' chains that consume output tokens \(and time\). A simple classification task that costs $0.001 on GPT-4o costs $0.05 on o1-preview because it 'thinks' for 10 seconds. For simple tasks the cost-quality curve is inverted: o1 is slower and more expensive, with no gain in quality. The real win is competition-level math and programming: o1-mini matches o1-preview on AIME math problems at 1/10th the cost. The heuristic: if a smart human needs >5 minutes and scratch paper, use o1. If <30 seconds, use 4o.
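The heuristic can be expressed as a small routing function. This is a sketch only: the `Task` fields, the domain labels, and the `choose_model` name are hypothetical, and the report does not say what to do with tasks that fall between 30 seconds and 5 minutes, so defaulting those to the cheaper model is an assumption.

```python
from dataclasses import dataclass

# Domains the report singles out as worth routing to a reasoning model.
REASONING_DOMAINS = {"math", "coding", "science"}


@dataclass(frozen=True)
class Task:
    domain: str               # e.g. "math", "classification" (hypothetical labels)
    logical_steps: int        # rough estimate of dependent reasoning steps
    est_human_seconds: float  # how long a smart human would need


def choose_model(task: Task) -> str:
    """Pick a model name using the report's rules of thumb."""
    # Under 30 seconds of human thought: the reasoning model adds cost, not quality.
    if task.est_human_seconds < 30:
        return "gpt-4o"
    # Math, coding, or science with more than 3 logical steps.
    if task.domain in REASONING_DOMAINS and task.logical_steps > 3:
        return "o1-mini"
    # More than 5 minutes of human effort (the scratch-paper case).
    if task.est_human_seconds > 300:
        return "o1-mini"
    # ASSUMPTION: the 30 s to 5 min gap is not covered by the report;
    # fall back to the cheaper model.
    return "gpt-4o"


if __name__ == "__main__":
    print(choose_model(Task("classification", 1, 5)))
    print(choose_model(Task("math", 6, 600)))
```

In practice the two estimates would come from a cheap pre-classifier or from static per-endpoint configuration; the point is that the routing decision is made before the expensive model is called.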
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T21:50:18.372191+00:00— report_created — created