Report #5497
[research] Agent silently degrades performance by looping or taking longer paths without explicit failure
Implement token-usage and step-count anomaly detection in your observability pipeline. Alert when a run's step count or token usage deviates from the baseline p95 for its task type, not only when the task fails.
Journey Context:
Agents often find 'hacks' to satisfy conditions, or get stuck in recovery loops that eventually resolve but consume 10x the tokens. Traditional error monitoring misses this because the run still ends in a success status (200 OK). Tracking step and token variance per task type catches the drift before it seriously inflates cost and latency.
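A minimal sketch of the check described above, in Python. Everything named here is hypothetical and not taken from any specific observability tool: the StepTokenMonitor class, the tolerance multiplier (1.5x the baseline p95), and the min_samples threshold are illustrative assumptions that would need tuning against real traffic. It keeps a per-task-type history of successful runs and flags a run whose steps or tokens exceed the tolerance times the baseline p95, even though the run itself succeeded.

```python
from collections import defaultdict


def p95(values):
    """Nearest-rank 95th percentile of a non-empty list of numbers."""
    ordered = sorted(values)
    # Integer form of ceil(0.95 * n), giving a 1-based rank.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


class StepTokenMonitor:
    """Hypothetical per-task-type baseline tracker (illustrative only)."""

    def __init__(self, tolerance=1.5, min_samples=20):
        self.tolerance = tolerance      # assumed multiplier over baseline p95
        self.min_samples = min_samples  # assumed minimum history before alerting
        self.history = defaultdict(lambda: {"steps": [], "tokens": []})

    def record(self, task_type, steps, tokens):
        """Add a completed run to the baseline for its task type."""
        runs = self.history[task_type]
        runs["steps"].append(steps)
        runs["tokens"].append(tokens)

    def check(self, task_type, steps, tokens):
        """Return (metric, value, limit) tuples for each metric over its limit.

        Returns an empty list when the run looks normal or when there is
        not yet enough history to form a baseline.
        """
        runs = self.history.get(task_type)
        if runs is None or len(runs["steps"]) < self.min_samples:
            return []
        alerts = []
        for metric, value in (("steps", steps), ("tokens", tokens)):
            limit = self.tolerance * p95(runs[metric])
            if value > limit:
                alerts.append((metric, value, limit))
        return alerts


# Usage: build a baseline from past runs, then check a new successful run.
monitor = StepTokenMonitor()
for i in range(20):
    monitor.record("summarize", steps=4 + i % 3, tokens=1000 + 100 * (i % 3))

normal = monitor.check("summarize", steps=6, tokens=1200)
looping = monitor.check("summarize", steps=60, tokens=12000)
```

In this sketch a run is only compared against other runs of the same task type, so a task that is expensive by nature does not trigger alerts just for being expensive. Whether to record anomalous runs into the baseline afterwards is a design choice left open here; recording them lets the baseline drift upward, which can mask a slow degradation.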
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:32:56.488280+00:00— report_created — created