Report #76854
[cost_intel] Using reasoning models for all code generation indiscriminately without complexity analysis
Use reasoning models only when cyclomatic complexity exceeds 10 or a novel algorithm is required; use GPT-4o-mini for CRUD and boilerplate (30x cost savings with a 95% success rate)
Journey Context:
On HumanEval, reasoning models achieve 90%+ versus 80% for GPT-4o, but cost $0.60 versus $0.02 per solution (30x). For simple CRUD APIs with cyclomatic complexity below 5, however, GPT-4o with good system prompts achieves 95% success. The failure signature of cheap models is looping on edge cases or generating deeply nested conditionals ("if-hell"). Measure McCabe (cyclomatic) complexity: if it exceeds 10, or the task relies on unfamiliar libraries, use a reasoning model; otherwise use the cheap model.
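The routing rule can be sketched roughly as follows. This is a minimal illustration, not a verified implementation. It assumes the code being modified (or a comparable reference implementation) is available as Python source to measure, and it approximates McCabe complexity by counting decision points with the standard `ast` module. The function names, the tier labels `"reasoning"` and `"cheap"`, and the `unfamiliar_library` / `novel_algorithm` flags are hypothetical; they are not taken from the report.

```python
import ast

# Node types treated as decision points in this approximation.
_BRANCH_NODES = (
    ast.If, ast.For, ast.AsyncFor, ast.While,
    ast.ExceptHandler, ast.IfExp, ast.Assert, ast.comprehension,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity as 1 + number of decision points."""
    tree = ast.parse(source)
    complexity = 1
    for node in ast.walk(tree):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # `a and b and c` has three operands and adds two extra paths.
            complexity += len(node.values) - 1
    return complexity


def choose_model_tier(
    source: str,
    unfamiliar_library: bool = False,  # hypothetical flag set by the caller
    novel_algorithm: bool = False,     # hypothetical flag set by the caller
    threshold: int = 10,               # threshold stated in the report
) -> str:
    """Return "reasoning" for complex or unfamiliar work, else "cheap"."""
    if novel_algorithm or unfamiliar_library:
        return "reasoning"
    if cyclomatic_complexity(source) > threshold:
        return "reasoning"
    return "cheap"


if __name__ == "__main__":
    crud = "def get_user(db, user_id):\n    return db.get(user_id)\n"
    print(cyclomatic_complexity(crud), choose_model_tier(crud))
```

A dedicated tool such as a complexity linter would likely give more accurate counts than this hand-rolled walker; the sketch only shows where the threshold check sits in the routing decision.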
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:35:54.095669+00:00— report_created — created