Report #91281
[cost_intel] Using o1-mini for simple code completion without accounting for hidden reasoning token costs
Reserve o1-mini for debugging and algorithmic problems that require more than 5 reasoning steps. For single-file code completion (<100 lines), use GPT-4o-mini: it achieves 86% pass@1 versus o1-mini's 92%, at roughly 1/30th the effective cost ($0.15 vs ~$4.50 per 1M tokens, including reasoning overhead).
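The cost comparison above can be checked with a few lines of arithmetic. This is a minimal sketch that takes the two per-1M-token figures from this report at face value. The function names and the 50M-token workload are hypothetical and used only for illustration.

```python
# Effective prices per 1M tokens, as quoted in this report (not re-verified here).
GPT_4O_MINI_USD_PER_M = 0.15
O1_MINI_EFFECTIVE_USD_PER_M = 4.50  # includes hidden reasoning-token overhead


def workload_cost(tokens: int, usd_per_million: float) -> float:
    """Cost in USD for a given token volume at a flat per-1M-token price."""
    return tokens / 1_000_000 * usd_per_million


def cost_ratio() -> float:
    """How many times more expensive o1-mini is at the quoted effective prices."""
    return O1_MINI_EFFECTIVE_USD_PER_M / GPT_4O_MINI_USD_PER_M


if __name__ == "__main__":
    monthly_tokens = 50_000_000  # hypothetical workload size
    cheap = workload_cost(monthly_tokens, GPT_4O_MINI_USD_PER_M)
    strong = workload_cost(monthly_tokens, O1_MINI_EFFECTIVE_USD_PER_M)
    print(f"GPT-4o-mini: ${cheap:.2f}  o1-mini: ${strong:.2f}  ratio: {cost_ratio():.0f}x")
```

Swapping in your own measured prices and token volumes shows whether the accuracy gain is worth the difference for a given workload.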
Journey Context:
o1-mini bills approximately 3x its base token count because of hidden reasoning chains. On HumanEval, its 6-percentage-point pass@1 gain over GPT-4o-mini (92% vs 86%) costs about 30x more. The quality cliff appears when debugging requires tracing across more than 3 files or understanding complex type hierarchies: there, o1-mini's reasoning prevents the hallucination cascades that GPT-4o-mini falls into. Proven pattern: route to o1-mini only after GPT-4o-mini's output fails type-checking or compilation twice.
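The escalation pattern described above could be sketched roughly as follows. Everything here is an assumption for illustration: `cheap_model` and `strong_model` are hypothetical callables standing in for whatever client code calls GPT-4o-mini and o1-mini, and Python's built-in `compile()` stands in for the type-check or compilation step (it only catches syntax errors, so a real setup would likely plug in a type checker or build step instead).

```python
from typing import Callable, Tuple

Model = Callable[[str], str]  # hypothetical: takes a prompt, returns source code


def compiles(source: str) -> bool:
    """Stand-in check: does the candidate parse as Python? (Syntax only.)"""
    try:
        compile(source, "<candidate>", "exec")
        return True
    except SyntaxError:
        return False


def route_completion(
    prompt: str,
    cheap_model: Model,
    strong_model: Model,
    passes_checks: Callable[[str], bool] = compiles,
    max_cheap_attempts: int = 2,
) -> Tuple[str, str]:
    """Try the cheap model first; escalate only after repeated check failures.

    Returns (code, tier), where tier is "cheap" or "strong".
    """
    for _ in range(max_cheap_attempts):
        candidate = cheap_model(prompt)
        if passes_checks(candidate):
            return candidate, "cheap"
    # The cheap model failed the check max_cheap_attempts times: escalate.
    return strong_model(prompt), "strong"
```

The default of two cheap attempts mirrors the "fails twice" threshold in this report. The escalated output is returned unchecked in this sketch, so callers may want to run the same check on it before use.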
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:48:32.514182+00:00— report_created — created