Report #38315
[research] Agent stuck in tool-selection loops or choosing the wrong tools without obvious failures
Emit a telemetry span for every tool call, capturing the tool name, input schema, output schema, and latency. Compute tool selection accuracy over your eval set as the fraction of tasks where the tool the agent chose matches the tool labeled optimal for that task.
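A minimal sketch of the idea, using only the Python standard library. The names `ToolSpan`, `SPANS`, `traced_tool_call`, and `tool_selection_accuracy` are hypothetical, and the in-memory list stands in for whatever tracing backend you actually export to. As a rough proxy for the input and output schemas, the sketch records only the argument names and the result's type name; a real setup would likely record the full schemas.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class ToolSpan:
    """One record per tool call (hypothetical shape, not a standard format)."""
    tool_name: str
    input_keys: list[str]   # proxy for the input schema: argument names only
    output_type: str        # proxy for the output schema: result type name
    latency_ms: float
    error: Optional[str] = None


# In-memory sink; assumed to be replaced by a real exporter in practice.
SPANS: list[ToolSpan] = []


def traced_tool_call(tool_name: str, tool_fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Call tool_fn(**kwargs) and record a span whether it succeeds or raises."""
    start = time.perf_counter()
    result: Any = None
    error: Optional[str] = None
    try:
        result = tool_fn(**kwargs)
        return result
    except Exception as exc:
        error = repr(exc)
        raise
    finally:
        SPANS.append(
            ToolSpan(
                tool_name=tool_name,
                input_keys=sorted(kwargs),
                output_type=type(result).__name__,
                latency_ms=(time.perf_counter() - start) * 1000.0,
                error=error,
            )
        )


def tool_selection_accuracy(chosen: dict[str, str], optimal: dict[str, str]) -> float:
    """Fraction of eval tasks where the chosen tool equals the labeled optimal tool.

    Both dicts map a task id to a tool name. Tasks missing from `chosen`
    count as misses.
    """
    if not optimal:
        return 0.0
    hits = sum(1 for task_id, tool in optimal.items() if chosen.get(task_id) == tool)
    return hits / len(optimal)
```

Wrapping each tool at the call site keeps the agent loop unchanged, and recording the span in `finally` means failed calls show up in the telemetry too.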
Journey Context:
Agents often loop because they misinterpret a tool's output or pick a tool that does not quite fit the task, which triggers a cascade of retries. Without per-tool telemetry, the symptom is indistinguishable from an agent that is simply slow. Tracking tool selection accuracy and retry rates helps isolate whether the problem lies in the tool description, the tool implementation, or the agent's reasoning.
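The retry-rate side can be sketched in the same spirit. This assumes each run is available as an ordered list of `(tool_name, args_fingerprint)` pairs, and it treats a call as a retry when it is identical to the call immediately before it. That definition, along with the function names `retry_rates` and `longest_repeat_run`, is an assumption made for illustration; your own notion of a retry may be looser, for example the same tool with slightly different arguments.

```python
from collections import Counter


def retry_rates(calls: list[tuple[str, str]]) -> dict[str, float]:
    """Per-tool share of calls that exactly repeat the immediately preceding call.

    `calls` is an ordered list of (tool_name, args_fingerprint) pairs
    from a single agent run.
    """
    totals: Counter = Counter()
    retries: Counter = Counter()
    previous = None
    for call in calls:
        tool_name = call[0]
        totals[tool_name] += 1
        if call == previous:
            retries[tool_name] += 1
        previous = call
    return {tool: retries[tool] / totals[tool] for tool in totals}


def longest_repeat_run(calls: list[tuple[str, str]]) -> int:
    """Length of the longest streak of identical consecutive calls (0 if empty)."""
    longest = 0
    current = 0
    previous = None
    for call in calls:
        current = current + 1 if call == previous else 1
        longest = max(longest, current)
        previous = call
    return longest
```

One possible reading of the results, offered as a heuristic rather than a rule: a long repeat run on a tool whose spans carry errors points toward the tool implementation, while a high retry rate on error-free calls, combined with low selection accuracy, points toward the tool description or the agent's reasoning.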
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:47:14.323353+00:00— report_created — created