Report #68985
[frontier] Embedding-based routing sends ambiguous queries to the wrong specialized agents, causing tool hallucinations and poor user experience
Replace embedding-similarity routing with a two-tier Semantic Router. For the top tier, use a small, fast LLM (e.g., GPT-4o-mini or Llama 3.1 8B) as a zero-shot classifier: give it a structured system prompt listing the available routes and their descriptions, and have it output JSON with a route name and a confidence score. Route high-confidence matches directly; send low-confidence queries to a fallback generalist or ask a clarifying question. Keep embedding similarity for the second tier, routing within a category (e.g., choosing between code languages), rather than using it as the primary top-level intent classifier.
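A minimal sketch of the classifier's input and output handling, assuming the model replies with a single JSON object. The names `Route`, `RoutingDecision`, `build_router_prompt`, `parse_router_reply`, and the `"generalist"` fallback route are hypothetical and chosen for illustration; the actual LLM call is left out because it depends on the provider.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    name: str
    description: str


@dataclass(frozen=True)
class RoutingDecision:
    route: str
    confidence: float


def build_router_prompt(routes):
    """Build a system prompt that lists every route with its description."""
    catalog = json.dumps(
        [{"name": r.name, "description": r.description} for r in routes]
    )
    return (
        "You are a router. Available routes: " + catalog + ". "
        'Return JSON with {"route": string, "confidence": 0-1}.'
    )


def parse_router_reply(reply, routes, fallback="generalist"):
    """Validate the model's reply; anything malformed becomes a zero-confidence fallback."""
    known = {r.name for r in routes}
    try:
        data = json.loads(reply)
        route = data["route"]
        confidence = float(data["confidence"])
    except (json.JSONDecodeError, KeyError, TypeError, ValueError):
        return RoutingDecision(fallback, 0.0)
    # Reject route names the model made up (assumption: treat them as "don't know").
    if not isinstance(route, str) or route not in known:
        return RoutingDecision(fallback, 0.0)
    # Clamp to [0, 1] in case the model returns an out-of-range score.
    return RoutingDecision(route, min(1.0, max(0.0, confidence)))
```

Treating malformed replies and unknown route names as zero confidence is one possible design choice: it keeps a hallucinated route from ever reaching a specialized agent.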
Journey Context:
Naive multi-agent systems route queries by embedding the user query and finding the nearest agent description in vector space (cosine similarity). This fails on ambiguous queries ('deploy' could mean software deployment, furniture arrangement, or military operations) and on out-of-distribution inputs, where the embeddings do not capture the user's intent. The Semantic Router pattern instead uses an LLM as a classifier with a carefully crafted system prompt: 'You are a router. Available routes: [ {name: "code_review", description: "Reviewing Python code for bugs"}, ... ]. Return JSON with {route: string, confidence: 0-1}.' The LLM can pick up nuanced intent and distinguish between requests that are semantically similar but functionally different. For high-confidence matches (>0.8), execute the specialized agent. For medium confidence (0.4 to 0.8), use a cheap embedding similarity check as a tie-breaker. For low confidence (<0.4), route to a generalist or ask clarifying questions. This trades the added latency of one LLM classifier call for more accurate routing on ambiguous queries. The pattern matters most for customer support bots with 50+ specialized intents, medical triage systems, and enterprise agent orchestration, where misrouting is costly.
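The three confidence bands can be sketched as a small dispatch function. The 0.8 and 0.4 thresholds come from the description above; the function name `choose_route`, the `"generalist"` fallback, and the tie-breaker rule (accept the LLM's route only when the embedding lookup agrees) are assumptions made for illustration, since the report does not specify how the tie-breaker resolves disagreements.

```python
HIGH_CONFIDENCE = 0.8  # above this, run the specialized agent directly
LOW_CONFIDENCE = 0.4   # below this, fall back to a generalist or clarify


def choose_route(llm_route, confidence, embedding_route, fallback="generalist"):
    """Pick the final route from the classifier's answer and an embedding lookup.

    llm_route:       route name returned by the LLM classifier
    confidence:      the classifier's self-reported score in [0, 1]
    embedding_route: nearest route by embedding similarity (used only as a tie-breaker)
    """
    if confidence > HIGH_CONFIDENCE:
        return llm_route
    if confidence < LOW_CONFIDENCE:
        return fallback
    # Medium band: assumed rule, keep the LLM's route only if the embedding check agrees.
    if embedding_route == llm_route:
        return llm_route
    return fallback


if __name__ == "__main__":
    print(choose_route("code_review", 0.93, "billing"))      # high band
    print(choose_route("code_review", 0.60, "code_review"))  # medium band, agreement
    print(choose_route("code_review", 0.20, "code_review"))  # low band
```

In this sketch the embedding result is never consulted in the high or low bands, so a caller could skip computing it unless the score lands between the two thresholds.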
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T22:16:25.940177+00:00— report_created — created