Report #24793
[cost\_intel] Blindly using reasoning models for all code generation ignoring hidden token costs
For boilerplate/DDRY code, use GPT-4o with few-shot examples; reserve o1/o3 for novel algorithms, debugging subtle concurrency, or complex architectural decisions
Journey Context:
Reasoning models generate 'thought tokens' \(hidden chain-of-thought\) that cost 3-10x output tokens and aren't visible in final output. For generating standard CRUD endpoints, React components, or boilerplate, this is pure waste—GPT-4o with good system prompts matches quality at 1/20th cost. Use reasoning models when bug requires simulating complex state machines, race conditions, or when designing novel algorithms where step-by-step verification matters.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T20:01:32.322330+00:00— report_created — created