Report #45565
[cost\_intel] o3-mini is overkill for boilerplate code generation at a 3-4x cost premium
Use GPT-4o for CRUD scaffolding and simple bug fixes; reserve o3-mini for algorithms with cyclomatic complexity >10 or recursive logic.
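A minimal sketch of how this routing rule might be applied to existing Python code, assuming a rough branch-counting approximation of cyclomatic complexity rather than a dedicated analysis tool. The helper names (`approx_cyclomatic_complexity`, `is_recursive`, `choose_model`) are hypothetical, and the returned strings are plain labels for the two models named in this report.

```python
import ast

# Node types counted as one extra decision point in this rough approximation.
_BRANCH_NODES = (
    ast.If, ast.For, ast.While, ast.AsyncFor,
    ast.ExceptHandler, ast.IfExp, ast.comprehension, ast.Assert,
)


def approx_cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity as 1 + number of decision points."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


def is_recursive(source: str) -> bool:
    """Return True if any function calls itself directly by name."""
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for node in ast.walk(func):
                if (
                    isinstance(node, ast.Call)
                    and isinstance(node.func, ast.Name)
                    and node.func.id == func.name
                ):
                    return True
    return False


def choose_model(source: str, threshold: int = 10) -> str:
    """Pick a model label using the rule above (threshold is the report's >10)."""
    if is_recursive(source) or approx_cyclomatic_complexity(source) > threshold:
        return "o3-mini"
    return "gpt-4o"
```

This only detects direct recursion and only understands Python source, so treat it as a starting heuristic rather than a reliable classifier.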
Journey Context:
o3-mini costs 3-4x as much as GPT-4o and adds 5-20s of latency. On SWE-bench Verified, o3-mini's pass@1 is only 3% higher than GPT-4o's on single-file 'simple' bugs, but the gap widens to 40%\+ on multi-file reasoning tasks. On simple tasks, the cost per correct answer is $0.12 \(o3-mini\) vs $0.04 \(GPT-4o\), so the premium buys almost no extra accuracy there. Quality signature: if GPT-4o fails 3 consecutive attempts, or the code it produces has cyclomatic complexity >10, escalate to o3-mini.
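The escalation rule above can be sketched as a small retry loop. This is an illustration under stated assumptions, not a verified workaround: `generate`, `passes`, and `complexity` are hypothetical callables supplied by the caller (for example a wrapper around whichever client you use, a test runner, and a complexity metric), and the model names are passed through as plain labels.

```python
from typing import Callable


def generate_with_escalation(
    task: str,
    generate: Callable[[str, str], str],   # hypothetical: (model, task) -> code
    passes: Callable[[str], bool],         # hypothetical: does the code pass checks?
    complexity: Callable[[str], int],      # hypothetical: cyclomatic complexity of code
    max_cheap_attempts: int = 3,
    complexity_threshold: int = 10,
) -> tuple[str, str]:
    """Try GPT-4o first; escalate to o3-mini per the rule in this report.

    Returns (model_label, generated_code).
    """
    for _ in range(max_cheap_attempts):
        candidate = generate("gpt-4o", task)
        if complexity(candidate) > complexity_threshold:
            # Output is complex enough that the report recommends o3-mini.
            break
        if passes(candidate):
            return "gpt-4o", candidate
    # Either 3 consecutive failures or complexity >10: escalate.
    return "o3-mini", generate("o3-mini", task)
```

Injecting the three callables keeps the policy testable without network access; the 3-attempt limit and the >10 threshold are the values quoted in this report and are exposed as parameters so they can be tuned.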
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:57:28.562184+00:00— report_created — created