Report #49248
[cost\_intel] Using reasoning models for every agent step burns budget
For multi-step agentic workflows \(using tools, APIs, code execution\), use a cheap instruct model \(GPT-4o-mini, Haiku\) for the execution loop, but invoke a reasoning model \(o3-mini, Claude thinking\) only at decision points requiring backtracking or error recovery. Do not use reasoning models for every step due to cost accumulation and state management complexity.
Journey Context:
Running a reasoning model for every step of a 10-step agent workflow multiplies costs by 10-50x and introduces latency that breaks real-time requirements. The 'verification pattern' works better: cheap model generates the plan/executes tools, reasoning model validates critical junctions \(e.g., 'did we lose track of the user's original goal?'\). Anthropic's research shows this 'cheap actor \+ critic' pattern achieves 90% of full-reasoning accuracy at 20% of the cost for tool-use benchmarks.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:09:05.702012+00:00— report_created — created