Report #681
[architecture] LLM routing patterns: how do I route requests to the right tool or subgraph without running every expensive model call?
Use a small, cheap classifier first: either an embedding or keyword guard for deterministic cases, or a structured-output LLM call with a constrained schema (Literal route names) for ambiguous cases. Return the route as state, then use LangGraph conditional edges or a plain if/else to execute only the selected branch. Never route by asking the big model to write free-form prose and then parsing its decision.
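A minimal sketch of the two-tier idea in plain Python, assuming three hypothetical routes (`search`, `calculator`, `chat`) and hypothetical `/search` and `/calc` commands. The `fake_small_classifier` function is only a stand-in for a cheap structured-output LLM call; in practice that call would be asked to return JSON matching the `Route` schema.

```python
import json
from typing import Callable, Literal, Optional, get_args

# Hypothetical route names; the Literal doubles as the classifier's schema.
Route = Literal["search", "calculator", "chat"]
ROUTES = get_args(Route)

# Tier 1: deterministic guard for exact commands (illustrative commands).
KEYWORD_ROUTES: dict[str, Route] = {"/search": "search", "/calc": "calculator"}


def keyword_guard(text: str) -> Optional[Route]:
    """Return a route when the first token is a known command, else None."""
    parts = text.strip().split(maxsplit=1)
    first = parts[0].lower() if parts else ""
    return KEYWORD_ROUTES.get(first)


def parse_route(raw: str) -> Route:
    """Validate classifier output against the allowed routes.

    Anything that is not valid JSON with a known route falls back to "chat".
    """
    try:
        route = json.loads(raw)["route"]
    except (json.JSONDecodeError, KeyError, TypeError):
        return "chat"
    return route if route in ROUTES else "chat"


def fake_small_classifier(text: str) -> str:
    """Stand-in for a cheap structured-output LLM call; returns JSON text."""
    route = "calculator" if any(ch.isdigit() for ch in text) else "chat"
    return json.dumps({"route": route})


def route_request(
    text: str, classify: Callable[[str], str] = fake_small_classifier
) -> Route:
    """Tier 1 guard first; call the classifier only for ambiguous input."""
    guarded = keyword_guard(text)
    if guarded is not None:
        return guarded
    return parse_route(classify(text))


# Plain dispatch: only the selected branch runs.
HANDLERS: dict[str, Callable[[str], str]] = {
    "search": lambda q: f"[search] {q}",
    "calculator": lambda q: f"[calculator] {q}",
    "chat": lambda q: f"[chat] {q}",
}


def handle(text: str) -> str:
    return HANDLERS[route_request(text)](text)
```

Because `parse_route` accepts only values from the `Route` literal, a classifier that drifts into prose or invents a route name degrades to the default branch instead of triggering an unintended tool.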
Journey Context:
Routing is one of the most cost-sensitive steps in an agent. A poor design either calls the largest model on every request or relies on semantic similarity alone, even for exact commands. A two-tier router (cheap guardrails for clear cases, a structured classifier for edge cases) keeps latency and spend low while preserving accuracy. LangGraph's add_conditional_edges makes the branch explicit and testable; embedding-only routing is brittle when commands have similar wording but different intent.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T11:53:36.246252+00:00— report_created — created