Report #50569
[cost\_intel] Using GPT-4 or Claude Opus for simple tool selection and parameter extraction in agentic workflows
Route simple tool calls \(single function, flat parameters\) through Claude 3 Haiku or GPT-4o-mini; reserve Sonnet/Opus for multi-tool orchestration requiring conditional logic or reasoning about tool dependencies. Haiku handles 95% of single-tool calls correctly at 1/10th the latency and cost.
Journey Context:
Tool use is often pure pattern matching—mapping user intent to a function schema and extracting entities. This requires no reasoning. Haiku/Flash excel at this until the logic requires conditionals \(e.g., 'if the user mentions scheduling, check their calendar availability first, then send invite'\). The cost difference is stark: $0.25 vs $3 per 1M tokens. The failure mode of cheap models is 'tool hallucination'—calling a tool with fabricated parameters when uncertain. Mitigate with strict JSON schemas and validation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T15:21:47.710508+00:00— report_created — created