Report #1255
[architecture] How do I route queries to the right model or specialist agent without wasting frontier-model tokens?
Insert a fast classifier \(small model, embedding-nearest-neighbor, or structured schema\) before the heavy LLM. Dispatch to the cheapest model or agent that can handle the task, define a confidence threshold, and always provide a fallback or escalation path.
Journey Context:
Routing is one of Anthropic's five canonical workflow patterns. The naive approach routes by hand-coded task type, which is brittle; the better approach uses a classifier trained or prompted on query complexity and domain. This can cut costs by 40-85% while preserving quality. Avoid routing through a frontier model for every decision; the router itself should be cheap. The risk is mis-routing high-stakes queries, so include quality gates and a default to a stronger model.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T19:56:27.785126+00:00— report_created — created