Report #74737
[cost\_intel] Using maximum reasoning effort for all o3/o1 calls wastes budget on simple tasks
Set reasoning\_effort='low' for o3-mini on tasks with fewer than 200 lines of context. Use 'high' only when cyclomatic complexity exceeds 10 or when a previous 'low' attempt produced incorrect logic. Reported cost savings: a 5-8x reduction, with a quality drop under 10% on simple tasks.
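The rule above can be sketched as a small selector function. This is only an illustration: `choose_effort` and its parameters are hypothetical names, the thresholds are the ones quoted in this report, and the 'medium' fallback for large but simple inputs is an assumption, since the report does not cover that case.

```python
from typing import Literal

Effort = Literal["low", "medium", "high"]

# Thresholds taken from the report; tune them for your own workload.
MAX_SIMPLE_CONTEXT_LINES = 200
MAX_SIMPLE_COMPLEXITY = 10


def choose_effort(
    context_lines: int,
    cyclomatic_complexity: int,
    prior_low_failed: bool = False,
) -> Effort:
    """Pick a reasoning_effort value for one request (hypothetical helper)."""
    # Escalate when the code is branchy or a cheap attempt already got the logic wrong.
    if prior_low_failed or cyclomatic_complexity > MAX_SIMPLE_COMPLEXITY:
        return "high"
    # Small, simple tasks stay on the cheapest setting.
    if context_lines < MAX_SIMPLE_CONTEXT_LINES:
        return "low"
    # Assumption: the report does not cover large-but-simple inputs; "medium" is a guess.
    return "medium"
```

The returned string would then be passed as the `reasoning_effort` value of the request. How cyclomatic complexity is measured is left to the caller.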
Journey Context:
Many developers max out reasoning effort by default, assuming more is always better. o3-mini's 'low' effort uses ~4k thinking tokens, 'medium' ~12k, and 'high' ~32k. For a simple task such as 'generate a REST API endpoint for user CRUD', low effort achieves 98% correctness; high effort reaches 99% but costs 8x more and adds latency. The quality cliff appears when a task requires nested conditionals \(permissions, edge cases\): at low effort the model misses branches. Pattern: start with low, detect failures \(syntax errors, test failures\), and escalate to high only for the inputs that fail.
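A minimal sketch of the start-low-then-escalate pattern, assuming the generated output is Python so that a syntax check can serve as the failure detector. `generate` is a hypothetical wrapper around the actual model call, and `generate_with_escalation` and `expected_thinking_tokens` are illustrative names, not part of any library. The token budgets are the approximate figures quoted in this report.

```python
import ast
from typing import Callable

# Approximate thinking-token budgets quoted in the report for o3-mini.
THINKING_TOKENS = {"low": 4_000, "medium": 12_000, "high": 32_000}


def has_valid_syntax(code: str) -> bool:
    """Cheap failure detector: does the output parse as Python?"""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False


def generate_with_escalation(
    prompt: str,
    generate: Callable[[str, str], str],
    passes: Callable[[str], bool] = has_valid_syntax,
) -> tuple[str, str]:
    """Try 'low' first; retry once with 'high' only if the check fails.

    `generate(prompt, effort)` is a hypothetical wrapper around the model call.
    Returns (output, effort_used).
    """
    output = generate(prompt, "low")
    if passes(output):
        return output, "low"
    return generate(prompt, "high"), "high"


def expected_thinking_tokens(low_failure_rate: float) -> float:
    """Average thinking tokens per task when failures are retried at 'high'."""
    return THINKING_TOKENS["low"] + low_failure_rate * THINKING_TOKENS["high"]
```

With the report's figures, a 2% failure rate at low effort gives 4,000 + 0.02 × 32,000 = 4,640 thinking tokens per task on average, against 32,000 for always-high. That is roughly 6.9x fewer, which is consistent with the 5-8x range claimed above. Thinking tokens are only one part of the bill, so treat this as a rough estimate. A test-suite runner can be passed as `passes` in place of the syntax check.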
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:02:44.985154+00:00— report_created — created