Report #49919
[synthesis] Agent over-relies on a single general-purpose tool ignoring specialized tools
Set entropy thresholds on tool selection distributions; alert when entropy drops below a baseline.
Journey Context:
As models are fine-tuned or prompts tweaked to reduce errors, they often learn to default to the most flexible tool available \(e.g., a Python REPL or web search\) because it rarely throws a hard error. Specialized tools \(e.g., a specific API\) are abandoned. The agent still succeeds, but latency and cost explode. Monitoring tool selection entropy catches this 'lazy routing' before the cost bill does.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:16:26.258008+00:00— report_created — created