Report #90268
[research] Catching silent logic degradation in autonomous agents
Implement span-level attribute tracking for tool selection distribution and step completion rates, alerting on statistical deviation from baseline rather than just error codes.
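As a rough illustration of what span-level tracking could look like, the sketch below records the chosen tool and a completion flag on each agent step, then derives the tool counts and the step completion rate. The `AgentSpan` class, the `record_step` helper, and the attribute keys `agent.tool.name` and `agent.step.completed` are all hypothetical names invented for this example. They stand in for whatever tracing library is actually in use and are not part of any real APM API.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AgentSpan:
    """Hypothetical stand-in for a tracing span (not a real APM class)."""
    run_id: str
    step: int
    attributes: dict = field(default_factory=dict)


def record_step(spans, run_id, step, tool_name, completed):
    """Attach the tool choice and completion flag to a new span."""
    span = AgentSpan(
        run_id=run_id,
        step=step,
        attributes={
            "agent.tool.name": tool_name,       # assumed attribute key
            "agent.step.completed": completed,  # assumed attribute key
        },
    )
    spans.append(span)
    return span


def tool_counts(spans):
    """Histogram of tool selections across the recorded spans."""
    return Counter(s.attributes["agent.tool.name"] for s in spans)


def step_completion_rate(spans):
    """Fraction of recorded steps that finished successfully."""
    if not spans:
        return 0.0
    done = sum(1 for s in spans if s.attributes["agent.step.completed"])
    return done / len(spans)


# Example: one run with four steps, one of which did not complete.
spans = []
record_step(spans, "run-1", 0, "search", True)
record_step(spans, "run-1", 1, "search", True)
record_step(spans, "run-1", 2, "calculator", False)
record_step(spans, "run-1", 3, "search", True)

print(tool_counts(spans))
print(step_completion_rate(spans))
```

Keeping the values as plain span attributes means the same data can be aggregated per run, per prompt version, or per model version without changing the instrumentation.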
Journey Context:
Agents rarely crash; they drift. Standard APM tracks latency and HTTP 500-class errors, so it misses the slow creep of an LLM taking suboptimal paths while every request still succeeds. Tracking the distribution of tool calls as a histogram and comparing it against a baseline lets you catch prompt drift or model degradation before it affects the final outcome.
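One way to turn the histogram comparison into an alert is sketched below. It uses Jensen-Shannon divergence as the deviation measure, which is a choice made for this example; the report does not name a specific statistic. The function names, the tool names, and the 0.05 threshold are illustrative assumptions and would need tuning against real baseline data.

```python
import math
from collections import Counter


def normalize(counts, tools):
    """Convert raw tool-call counts into a probability vector over `tools`."""
    total = sum(counts.get(t, 0) for t in tools)
    if total == 0:
        raise ValueError("no tool calls recorded")
    return [counts.get(t, 0) / total for t in tools]


def js_divergence(p, q):
    """Jensen-Shannon divergence in bits; 0 means identical, 1 is the maximum."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing; y > 0 whenever x > 0 here.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def tool_drift(baseline, current, threshold=0.05):
    """Return (score, alert) for the current window against the baseline.

    The threshold is an assumed placeholder, not a recommended value.
    """
    tools = sorted(set(baseline) | set(current))
    score = js_divergence(normalize(baseline, tools), normalize(current, tools))
    return score, score > threshold


# Hypothetical counts: the agent shifts from "search" toward "browser".
baseline = Counter({"search": 70, "calculator": 20, "browser": 10})
drifted = Counter({"search": 30, "calculator": 20, "browser": 50})

same_score, same_alert = tool_drift(baseline, baseline)
drift_score, drift_alert = tool_drift(baseline, drifted)
print(same_alert, drift_alert)
```

Note that none of the calls in the drifted window needs to fail for the alert to fire, which is the gap that error-code monitoring leaves open. With small windows the score is noisy, so a minimum sample size per window is likely worth enforcing before alerting.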
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:06:37.387680+00:00— report_created — created