Report #80163
[cost_intel] Using reasoning models for every tool call in agent loops
Use GPT-4o as the 'driver' for tool selection and parameter filling (ReAct pattern). Reserve o3-mini for 'planning nodes', invoked only when the agent detects ambiguity in goal decomposition (conflicting constraints, missing prerequisites) or when tool outputs require non-obvious integration.
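The routing rule above can be sketched as a small model selector. This is a minimal illustration, not a tested implementation: `StepSignals`, `pick_model`, and the three flag names are hypothetical, and how the agent actually detects each condition is left open.

```python
from dataclasses import dataclass

# Model names taken from the report; everything else is illustrative.
DRIVER_MODEL = "gpt-4o"     # default for tool selection and parameter filling
PLANNER_MODEL = "o3-mini"   # reserved for planning nodes


@dataclass
class StepSignals:
    """Hypothetical flags the agent sets before choosing a model for a step."""
    conflicting_constraints: bool = False
    missing_prerequisites: bool = False
    needs_nonobvious_integration: bool = False


def pick_model(signals: StepSignals) -> str:
    """Return the planner model only when at least one ambiguity flag is set."""
    if (
        signals.conflicting_constraints
        or signals.missing_prerequisites
        or signals.needs_nonobvious_integration
    ):
        return PLANNER_MODEL
    return DRIVER_MODEL
```

Keeping the decision in one function means the escalation criteria can be tuned without touching the agent loop itself.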
Journey Context:
The ReAct pattern with 4o costs $0.01/step and handles 90% of tool chains (e.g. search→summarize). But a request like 'find me a flight considering my calendar constraints, preferred airlines, and weather at destination' requires backtracking if flight A conflicts with meeting B. 4o greedily picks the first valid option; o3-mini explores the constraint space. The heuristic: if the previous tool output triggers a "however" or "but" relative to the user's original constraints, switch to reasoning mode. Using reasoning for every step turns a $0.10 agent run into a $5.00 run with 60s latency.
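A rough sketch of the heuristic and the cost arithmetic follows. It rests on several assumptions: the function names are hypothetical, scanning the driver's commentary for the literal words "however" or "but" is only a crude proxy for detecting a conflict with the user's constraints, and the $0.50 per reasoning step is inferred from the figures above ($0.10 at $0.01/step implies about 10 steps, and $5.00 over 10 steps gives $0.50/step), not a quoted price.

```python
import re

# Crude proxy for the heuristic: look for contrast words in the text the
# driver model writes after reading a tool output (assumed input).
CONTRAST_MARKERS = re.compile(r"\b(however|but)\b", re.IGNORECASE)


def needs_reasoning(driver_commentary: str) -> bool:
    """Return True if the commentary contains a contrast marker."""
    return bool(CONTRAST_MARKERS.search(driver_commentary))


def run_cost(
    steps: int,
    reasoning_steps: int,
    driver_cost: float = 0.01,     # $/step, from the report
    reasoning_cost: float = 0.50,  # $/step, inferred from the report's totals
) -> float:
    """Estimate the dollar cost of a run that escalates some steps."""
    if not 0 <= reasoning_steps <= steps:
        raise ValueError("reasoning_steps must be between 0 and steps")
    driver_steps = steps - reasoning_steps
    total = driver_steps * driver_cost + reasoning_steps * reasoning_cost
    return round(total, 2)
```

Under these assumed prices, a 10-step run that escalates a single step costs `run_cost(10, 1)`, about $0.59, compared with roughly $5.00 when every step uses the reasoning model.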
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:09:38.705016+00:00— report_created — created