Report #93534
[cost_intel] o1-mini competitive programming success fails to transfer to production system design
Use o1-mini for algorithmic coding challenges (competitive programming, LeetCode) at roughly 1/18th the cost of o1-preview ($3.30 vs $60.00 per MTok), but upgrade to o1-preview for system architecture, debugging production logs longer than 500 lines, or distributed systems design that requires broad context integration.
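The routing rule above can be sketched as a small function. This is only an illustration: the task labels, the `Task` type, and `choose_model` are hypothetical names, and the fallback to o1-preview for unlisted task kinds is an assumption the report does not state.

```python
from dataclasses import dataclass

# Hypothetical task labels; the report names the categories but defines no identifiers.
ALGORITHMIC = {"competitive_programming", "leetcode"}
BROAD_CONTEXT = {"system_architecture", "distributed_systems_design"}

# From the report: production logs over 500 lines should go to o1-preview.
LOG_LINE_LIMIT = 500


@dataclass
class Task:
    kind: str           # one of the labels above, or any other string
    log_lines: int = 0  # number of log lines attached to the request


def choose_model(task: Task) -> str:
    """Return the model name suggested by the report's rule of thumb."""
    if task.kind in BROAD_CONTEXT:
        return "o1-preview"
    if task.log_lines > LOG_LINE_LIMIT:
        return "o1-preview"
    if task.kind in ALGORITHMIC:
        return "o1-mini"
    # Assumption: anything the report does not classify goes to the stronger model.
    return "o1-preview"
```

For example, `choose_model(Task("leetcode"))` selects o1-mini, while `choose_model(Task("production_debugging", log_lines=800))` selects o1-preview.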
Journey Context:
o1-mini and o1-preview both use chain-of-thought reasoning, but o1-mini has a more restricted context window and training focus. On HumanEval (algorithmic), o1-mini scores 92% vs o1-preview's 93%. On SWE-bench (real GitHub issues requiring multi-file context), o1-mini scores 8% vs o1-preview's 41%. The cost cliff appears at context boundaries: o1-mini excels on problems that fit in under 8k tokens of reasoning, while production debugging often requires 50k+ tokens of logs and source code.
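To make the cost gap concrete, here is a minimal sketch of the per-request arithmetic. It assumes a single flat per-MTok price for each model, using the $3.30 and $60.00 figures quoted in this report; `PRICE_PER_MTOK`, `request_cost`, and `price_ratio` are hypothetical names.

```python
# Flat per-million-token prices as quoted in this report (assumed, not verified).
PRICE_PER_MTOK = {"o1-mini": 3.30, "o1-preview": 60.00}


def request_cost(model: str, tokens: int) -> float:
    """Dollar cost of one request that consumes `tokens` tokens."""
    return PRICE_PER_MTOK[model] * tokens / 1_000_000


def price_ratio() -> float:
    """How many times more expensive o1-preview is per token."""
    return PRICE_PER_MTOK["o1-preview"] / PRICE_PER_MTOK["o1-mini"]


if __name__ == "__main__":
    for model in PRICE_PER_MTOK:
        small = request_cost(model, 8_000)   # algorithmic-sized problem
        large = request_cost(model, 50_000)  # production-debugging-sized problem
        print(f"{model}: 8k tokens ${small:.4f}, 50k tokens ${large:.4f}")
    print(f"price ratio: {price_ratio():.1f}x")
```

Under these assumed prices, a 50k-token debugging request costs about $3.00 on o1-preview and about $0.165 on o1-mini, a ratio of roughly 18x.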
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T15:35:05.184585+00:00— report_created — created