Report #88717
[cost_intel] Using expensive models for tool selection in agent loops
Use Haiku for tool selection and parameter filling in ReAct loops, and reserve Sonnet for synthesis steps. Reported effect: agent loop costs drop by 80% with a <5% drop in tool selection accuracy, given Haiku at $0.25/1M tokens versus Sonnet at $3/1M tokens.
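The arithmetic behind the cost claim can be sketched as follows. This is a rough model, not a measurement: it uses the single per-model price quoted above (ignoring any input/output price split), and `haiku_share`, the fraction of loop tokens routed to Haiku, is an assumed parameter that the report does not state.

```python
# Prices per 1M tokens, taken from the report summary.
HAIKU_PER_M = 0.25
SONNET_PER_M = 3.00


def blended_cost(total_tokens: int, haiku_share: float) -> float:
    """USD cost when `haiku_share` of the tokens go to Haiku and the rest to Sonnet."""
    if not 0.0 <= haiku_share <= 1.0:
        raise ValueError("haiku_share must be between 0 and 1")
    haiku_tokens = total_tokens * haiku_share
    sonnet_tokens = total_tokens - haiku_tokens
    return (haiku_tokens * HAIKU_PER_M + sonnet_tokens * SONNET_PER_M) / 1_000_000


def saving_vs_all_sonnet(haiku_share: float) -> float:
    """Fractional saving relative to sending every token to Sonnet."""
    return haiku_share * (1 - HAIKU_PER_M / SONNET_PER_M)


def share_needed(target_saving: float) -> float:
    """Haiku token share required to reach `target_saving` (e.g. 0.8 for 80%)."""
    return target_saving / (1 - HAIKU_PER_M / SONNET_PER_M)


if __name__ == "__main__":
    print(f"all-Haiku saving: {saving_vs_all_sonnet(1.0):.1%}")
    print(f"Haiku share needed for an 80% saving: {share_needed(0.8):.1%}")
```

Under these assumptions the saving tops out at about 92% when every token goes to Haiku, so an 80% saving implies roughly 87% of loop tokens are handled by Haiku. Whether a given loop reaches that split depends on how much context Sonnet has to read when it reviews results.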
Journey Context:
Agent architectures (ReAct, Plan-and-Solve) alternate between reasoning and tool calls. Using Sonnet for every step means paying $3/1M tokens for JSON formatting and tool name selection, tasks that require minimal reasoning. Pattern: Haiku selects tools and fills parameters (JSON generation); Sonnet reviews results and either decides the next step or synthesizes the final answer. Latency win: Haiku is 3x faster, and the gain compounds over a 10-step loop. Failure mode: Haiku hallucinates tool names that are not in the schema; mitigate with constrained JSON mode/grammar, or with Pydantic validation plus retry.
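A minimal sketch of the validate-and-retry mitigation, under stated assumptions. It uses plain standard-library checks as a stand-in for Pydantic so that it runs without dependencies. The `TOOLS` schema, the `{"tool": ..., "args": ...}` reply shape, and the `cheap_model` callable (any function that takes a prompt and returns the model's raw text) are all hypothetical, not part of any real SDK.

```python
import json
from typing import Callable

# Hypothetical tool schema: tool name -> required argument names and types.
TOOLS = {
    "search": {"query": str},
    "get_weather": {"city": str},
}


def validate_tool_call(raw: str) -> dict:
    """Parse a JSON tool call and reject anything that does not match TOOLS."""
    call = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("tool")
    if not isinstance(name, str) or name not in TOOLS:
        raise ValueError(f"unknown tool {name!r}; allowed: {sorted(TOOLS)}")
    args = call.get("args")
    if not isinstance(args, dict):
        raise ValueError("'args' must be a JSON object")
    spec = TOOLS[name]
    if set(args) != set(spec):
        raise ValueError(f"{name} expects arguments {sorted(spec)}, got {sorted(args)}")
    for key, expected_type in spec.items():
        if not isinstance(args[key], expected_type):
            raise ValueError(f"argument {key!r} must be {expected_type.__name__}")
    return {"tool": name, "args": args}


def select_tool(cheap_model: Callable[[str], str], prompt: str, max_retries: int = 2) -> dict:
    """Ask the cheap model for a tool call; feed validation errors back on retry."""
    attempt_prompt = prompt
    last_error = None
    for _ in range(max_retries + 1):
        raw = cheap_model(attempt_prompt)
        try:
            return validate_tool_call(raw)
        except ValueError as err:
            last_error = err
            attempt_prompt = (
                f"{prompt}\n\nYour previous reply was rejected: {err}. "
                "Reply with valid JSON only."
            )
    raise RuntimeError(f"no valid tool call after {max_retries + 1} attempts: {last_error}")
```

In a loop, `select_tool` would wrap the Haiku call, while the review and synthesis steps go to Sonnet. When retries are exhausted, one option is to escalate that single step to the more expensive model rather than fail the whole run.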
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T07:29:57.730912+00:00— report_created — created