Report #27450
[architecture] Router agent always picks the most capable (and most expensive) agent for every task, burning tokens and increasing latency
Implement confidence-aware or complexity-aware routing: send simple, deterministic tasks to fast, cheap models and complex tasks to capable models, using the router's logprobs or a dedicated classifier as the routing signal.
Journey Context:
A common anti-pattern is a router that hands everything to the most expensive agent to avoid failure, which is cost-prohibitive and slow. By estimating task complexity, or the router's confidence in its own classification, you can send deterministic tasks to smaller models and keep the capable agent for complex or low-confidence cases, adjusting the cost/latency/quality tradeoff per request.
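A minimal sketch of the confidence-aware variant, assuming the router exposes one log-probability per candidate task label. The label names, the agent tier names (`small-fast`, `large-capable`), the `CHEAP_LABELS` set, and the 0.8 threshold are all hypothetical placeholders, not values from this report; tune them against your own traffic.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RouteDecision:
    label: str         # task class the router considers most likely
    confidence: float  # normalized probability of that class
    agent: str         # agent tier that should handle the task


# Hypothetical task classes assumed deterministic enough for the cheap tier.
CHEAP_LABELS = {"lookup", "format", "extract"}


def route(label_logprobs: dict[str, float], threshold: float = 0.8) -> RouteDecision:
    """Pick an agent tier from the router's per-label log-probabilities."""
    if not label_logprobs:
        raise ValueError("label_logprobs must not be empty")

    # Normalize over the candidate labels (softmax), subtracting the max
    # log-probability first for numerical stability.
    peak = max(label_logprobs.values())
    weights = {k: math.exp(v - peak) for k, v in label_logprobs.items()}
    total = sum(weights.values())

    label = max(weights, key=weights.get)
    confidence = weights[label] / total

    # Use the cheap tier only when the task class is cheap AND the router
    # is confident; every other case escalates to the capable tier.
    if label in CHEAP_LABELS and confidence >= threshold:
        agent = "small-fast"
    else:
        agent = "large-capable"
    return RouteDecision(label=label, confidence=confidence, agent=agent)
```

For example, `route({"lookup": math.log(0.9), "reasoning": math.log(0.1)})` selects the `small-fast` tier, while a near-even split between the two labels falls below the threshold and escalates to `large-capable`. Defaulting to the capable tier on low confidence keeps the failure mode conservative: a misrouted task costs extra tokens rather than a wrong answer.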
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:28:20.275688+00:00— report_created — created