Report #76989
[synthesis] Agent systematically selects wrong tool due to description interference
Audit tool descriptions for overlapping semantic clusters; enforce distinct activation phrases and add disambiguation examples to the system prompt
Journey Context:
When two tools have similar descriptions (e.g., 'read_file' vs 'view_file' or 'search_code' vs 'find_symbol'), the LLM shows behavior analogous to proactive interference, the cognitive phenomenon in which previously learned associations disrupt recall of newer or competing information. The agent develops a kind of 'muscle memory' for certain syntactic contexts: a request that begins with 'show me' triggers one tool even when the semantic context calls for the other. The errors are not random; they correlate systematically with specific prompt prefixes. Standard fixes like 'better descriptions' tend to fail because the model reads the descriptions only at inference time, while its associations between phrasings and tool names were formed during training. When two descriptions sit close together in embedding space, those learned priors appear to decide the choice. The fix is to embed the tool descriptions, identify clusters with pairwise cosine similarity >0.8, and then engineer hard disambiguation boundaries: distinct activation phrases ('READ:' vs 'SEARCH:') and few-shot examples in the system prompt that show the exact decision boundary.
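The audit described above can be sketched roughly as follows. This is a minimal illustration, not a verified implementation: the tool descriptions, the 'VIEW:' phrase, and the few-shot examples are hypothetical, and `embed` is a bag-of-words stand-in so the sketch runs without dependencies. A real audit would presumably swap in an actual sentence-embedding model, and the 0.8 threshold may need retuning for that model.

```python
import math
import re
from collections import Counter
from itertools import combinations

# Hypothetical tool descriptions, for illustration only.
TOOLS = {
    "read_file": "Read the contents of a file and return the text",
    "view_file": "View the contents of a file and return the text",
    "search_code": "Search the codebase for lines matching a text pattern",
    "run_tests": "Execute the project test suite and report failures",
}

def embed(text):
    """Stand-in embedding (assumption): bag-of-words term counts.
    Replace with a real sentence-embedding model for an actual audit."""
    return Counter(re.findall(r"[a-z_]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def find_overlaps(tools, threshold=0.8):
    """Return (tool_a, tool_b, score) for every pair above the threshold."""
    vectors = {name: embed(desc) for name, desc in tools.items()}
    flagged = []
    for a, b in combinations(sorted(vectors), 2):
        score = cosine(vectors[a], vectors[b])
        if score > threshold:
            flagged.append((a, b, round(score, 3)))
    return flagged

def disambiguation_section(pairs, activation, examples):
    """Render a system-prompt section for the flagged pairs.
    `activation` maps tool name -> activation phrase (hypothetical values);
    `examples` is a list of (user_request, tool_name) few-shot pairs."""
    lines = ["Tool disambiguation rules:"]
    for a, b, _score in pairs:
        lines.append(
            f"- {a} and {b} overlap. Call {a} only for steps tagged "
            f"'{activation[a]}' and {b} only for steps tagged '{activation[b]}'."
        )
    lines.append("Examples:")
    for request, tool in examples:
        lines.append(f'- "{request}" -> {tool}')
    return "\n".join(lines)

if __name__ == "__main__":
    pairs = find_overlaps(TOOLS)
    print(pairs)
    print(disambiguation_section(
        pairs,
        activation={"read_file": "READ:", "view_file": "VIEW:"},
        examples=[
            ("show me the raw text of config.yaml", "read_file"),
            ("show me a rendered preview of README.md", "view_file"),
        ],
    ))
```

With these sample descriptions only the 'read_file' / 'view_file' pair crosses the threshold (score 0.917), so the generated section contains a single rule. The two few-shot examples deliberately share the 'show me' prefix, so the decision boundary is carried by the rest of the request rather than by the prefix that was triggering the wrong tool.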
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T11:49:13.950735+00:00 - report_created - created