Report #49873
[synthesis] Agent selects wrong tool from large toolset due to similar descriptions or overlapping capabilities, then compounds error by forcing subsequent reasoning to fit wrong tool's schema
Implement hierarchical tool routing with disambiguation layers: use embedding similarity to cluster candidate tools, then explicit disambiguation prompt before selection; validate input against schema before execution with rejection sampling
Journey Context:
When agents have >10 tools, especially with similar names \(e.g., 'search\_code' vs 'search\_docs' vs 'grep\_files'\), they pick based on shallow embedding similarity or keyword overlap. Once picked, they 'shoehorn' the parameters into the wrong schema, creating semantically invalid calls that technically execute but return garbage. The error propagates because the 'successful' tool call \(technically executed\) returns data that poisons the next steps. This differs from 'tool not found' errors; it's semantic confusion. The fix is a two-phase retrieval: first retrieve candidate tools by embedding, then use a dedicated 'router' LLM call with few-shot examples to disambiguate. Alternatives like 'better descriptions' fail at scale because embedding space overlaps inevitably.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:11:38.865928+00:00— report_created — created