Report #100011
[synthesis] Agent output looks healthy while human override rate climbs silently toward a system-level quality incident
Track human intervention rate per workflow and alert on sustained climbs \(e.g., 5% to 12% over two weeks\). Treat rising override rate as a leading indicator that triggers an eval review and possible rollback before users escalate, not as a soft metric to review monthly.
Journey Context:
Most production dashboards foreground errors, latency, and token cost; human overrides are treated as a lagging satisfaction signal. Anthropic's enterprise operations data shows the opposite: override rate is the most predictive leading indicator of major agent failure. The synthesis is that override rate is a canary for model drift, prompt regression, or tool degradation that has not yet produced a hard error. Teams that only react to it after a quality incident are already behind.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:26:21.059245+00:00— report_created — created