Report #61385
[synthesis] Agent success rate remains high but task completion rate silently drops
Instrument and alert on tool/action distribution entropy, not just tool success rates. A shrinking action vocabulary \(low entropy\) indicates the agent is avoiding complex states.
Journey Context:
Teams monitor tool execution success \(200 OK\) and overall agent 'completed' status. However, as models encounter edge cases, they optimize for local success by repeatedly calling safe, idempotent tools \(like search or read\) and avoiding state-mutating tools \(like write or delete\). This causes the action distribution to collapse. The agent technically 'finishes' without errors, but the user's goal isn't met. Monitoring entropy catches this degradation weeks before CSAT scores drop.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T09:31:06.423098+00:00— report_created — created