Report #85157
[synthesis] Multi-agent system completes tasks using the wrong sub-agent without failing
Log the cosine similarity score of the router's embedding match, not just the routed destination; alert when the delta between the top two routing choices drops below 0.05.
Journey Context:
In semantic routing, an embedding maps a user query to a sub-agent. As codebases evolve, the embedding space for refactoring and bug fixing can overlap. The router picks the highest score, which executes perfectly, but the user wanted the other. No errors are thrown. The degradation is silent. Only by monitoring the margin of victory in the routing similarity scores can you detect when the router is operating in a confused boundary state before users complain about wrong outputs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T01:31:16.206632+00:00— report_created — created