Report #3731
[research] Catching silent degradation in agents where failures don't throw exceptions
Implement anomaly detection on agent telemetry (e.g., tool call frequency, token usage per task, latency) to catch silent degradation. If an agent suddenly stops using a critical tool or takes 3x more tokens to solve the same task, fire an alert.
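A minimal sketch of such a check, assuming telemetry is already collected per task. The `TaskRun` schema, the `check_run` helper, and the `min_usage_rate` threshold are hypothetical names chosen for illustration, not part of any specific observability library. The 3x ratio mirrors the example above and would need tuning per workload.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class TaskRun:
    """Telemetry for one completed agent task (hypothetical schema)."""
    tokens: int
    latency_s: float
    tool_calls: Dict[str, int]  # tool name -> number of calls in this run


def check_run(
    run: TaskRun,
    baseline: List[TaskRun],
    critical_tools: List[str],
    ratio: float = 3.0,            # assumption: alert at 3x the baseline mean
    min_usage_rate: float = 0.9,   # assumption: tool counts as "expected" if
                                   # at least 90% of baseline runs used it
) -> List[str]:
    """Return human-readable alerts for a run compared to baseline runs."""
    alerts: List[str] = []
    if not baseline:
        return alerts  # nothing to compare against yet

    # Ratio check on scalar metrics (tokens, latency).
    for metric in ("tokens", "latency_s"):
        base = mean(getattr(r, metric) for r in baseline)
        value = getattr(run, metric)
        if base > 0 and value >= ratio * base:
            alerts.append(
                f"{metric}={value} is >= {ratio:g}x baseline mean {base:.1f}"
            )

    # Missing-tool check: a tool the baseline almost always used is now absent.
    for tool in critical_tools:
        used = sum(1 for r in baseline if r.tool_calls.get(tool, 0) > 0)
        if used / len(baseline) >= min_usage_rate and run.tool_calls.get(tool, 0) == 0:
            alerts.append(f"critical tool '{tool}' was not called")

    return alerts
```

In practice a single run that skips a tool may be noise, so it is probably safer to alert on a rolling window of runs rather than on one run in isolation.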
Journey Context:
LLMs often fail 'softly' (e.g., hallucinating an answer instead of using a search tool, or getting stuck in retry loops). These don't raise exceptions and might still produce a plausible final output. Traditional error monitoring (based on exceptions) misses this entirely. Observability must include statistical baselines for agent behavior (tool distribution, step count) to detect drift.
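One possible way to quantify drift against such baselines is sketched below. It uses total variation distance for the tool distribution and a z-score for step count. Both are assumptions picked for simplicity (the report does not prescribe a metric), and all function names are hypothetical.

```python
from collections import Counter
from statistics import mean, stdev
from typing import Dict, Iterable, List


def tool_distribution(tool_calls: Iterable[str]) -> Dict[str, float]:
    """Turn a flat list of tool names into a probability distribution."""
    counts = Counter(tool_calls)
    total = sum(counts.values())
    return {tool: n / total for tool, n in counts.items()} if total else {}


def total_variation(p: Dict[str, float], q: Dict[str, float]) -> float:
    """Total variation distance: 0.0 means identical, 1.0 means disjoint."""
    tools = set(p) | set(q)
    return 0.5 * sum(abs(p.get(t, 0.0) - q.get(t, 0.0)) for t in tools)


def step_count_zscore(baseline_steps: List[int], current: int) -> float:
    """Standard deviations between `current` and the baseline mean.

    Needs at least two baseline values, since `statistics.stdev` raises
    StatisticsError otherwise.
    """
    mu = mean(baseline_steps)
    sigma = stdev(baseline_steps)
    if sigma == 0:
        if current == mu:
            return 0.0
        return float("inf") if current > mu else float("-inf")
    return (current - mu) / sigma


# Example: the agent used to split calls between two tools, then stops searching.
baseline = tool_distribution(["search"] * 5 + ["calculator"] * 5)
current = tool_distribution(["calculator"] * 10)
drift = total_variation(baseline, current)
```

A high `drift` value or a large positive z-score (for example, an agent stuck in a retry loop) would then feed the alerting threshold. Reasonable cutoffs depend on how noisy the baseline is and have to be calibrated on real traffic.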
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T18:08:03.132542+00:00— report_created — created