Report #28868
[synthesis] Agent starts using a less capable but faster tool instead of the optimal one, leading to lower quality results without failing
Track tool selection distributions over time. Alert on shifts in tool usage ratios (e.g., an increase in generic web search relative to specific API calls). Score outcomes based on the tool used.
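One way to alert on a shift in tool usage ratios is to compare a baseline window of tool calls against a recent window. The sketch below is a minimal illustration, not a prescribed method: it assumes tool calls are logged as plain tool-name strings, uses total variation distance as the shift measure, and picks an arbitrary threshold of 0.2. The tool names `orders_api` and `web_search` are hypothetical.

```python
from collections import Counter


def tool_ratios(calls):
    """Return the share of calls made to each tool name."""
    counts = Counter(calls)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {tool: n / total for tool, n in counts.items()}


def usage_shift(baseline_calls, recent_calls):
    """Total variation distance between two tool-usage distributions.

    0.0 means identical ratios, 1.0 means no overlap in tools used.
    """
    base = tool_ratios(baseline_calls)
    recent = tool_ratios(recent_calls)
    tools = set(base) | set(recent)
    return 0.5 * sum(abs(base.get(t, 0.0) - recent.get(t, 0.0)) for t in tools)


def should_alert(baseline_calls, recent_calls, threshold=0.2):
    """Flag a shift larger than the (assumed, tunable) threshold."""
    return usage_shift(baseline_calls, recent_calls) > threshold


# Hypothetical logs: the agent drifts from the specific API to web search.
baseline = ["orders_api"] * 8 + ["web_search"] * 2
recent = ["orders_api"] * 3 + ["web_search"] * 7
print(should_alert(baseline, recent))
```

The threshold and window sizes would need tuning against normal week-to-week variation in the workload before such an alert is useful.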
Journey Context:
LLM agents tend to take the path of least effort. If a complex API tool requires precise parameters, the agent may drift toward a simpler tool that is easier to parameterize. The task still completes, but answer quality drops. Monitoring task completion alone misses this; you must also monitor the trajectory and tool selection ratios to catch silent degradation.
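To see why completion alone hides the problem, it can help to break quality scores down by the tool that was used. The following sketch assumes each run is recorded as a `(tool, completed, quality)` tuple, where `quality` is a score between 0 and 1 from whatever grader is available (a rubric, a human review, or an eval model). The record layout, the tool names, and the scores are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def completion_rate(runs):
    """Fraction of runs that completed, regardless of quality."""
    return sum(1 for _, completed, _ in runs if completed) / len(runs)


def quality_by_tool(runs):
    """Mean quality score of completed runs, grouped by the tool used."""
    scores = defaultdict(list)
    for tool, completed, quality in runs:
        if completed:
            scores[tool].append(quality)
    return {tool: mean(vals) for tool, vals in scores.items()}


# Hypothetical runs: every task completes, but quality differs by tool.
runs = [
    ("orders_api", True, 0.9),
    ("orders_api", True, 0.7),
    ("web_search", True, 0.5),
    ("web_search", True, 0.3),
]
print(completion_rate(runs))
print(quality_by_tool(runs))
```

In this made-up data the completion rate is perfect while the per-tool averages differ sharply, which is the pattern that a completion-only dashboard would not surface.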
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:50:51.136005+00:00— report_created — created