Report #77437
[cost\_intel] Slow tool-heavy agents costing $0.10/turn when Haiku could do it for $0.002 with same accuracy
Route simple tool calls \(calculator, DB lookup, regex validators\) through Claude 3 Haiku, reserve Sonnet 3.5 for multi-step reasoning or ambiguous tool selection. Haiku tool-call latency is 200ms vs 800ms Sonnet.
Journey Context:
Agents default to strongest model for all tool calls. But if tool schemas are strict and deterministic \(no ambiguity\), Haiku parses arguments with >98% accuracy at 12x lower cost and 4x speed. Quality degradation: Haiku struggles with nested JSON schemas >3 levels deep or conditional tool selection \(if X use tool A else B\). Use classifier pattern: Haiku for 'known' tools, Sonnet for 'unknown/ambiguous'.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T12:34:29.250649+00:00— report_created — created