Report #45284
[synthesis] Agent self-correction loops mask underlying capability degradation
Track the 'correction ratio' \(number of successful corrections / total attempts\). Alert when the ratio drops or when the number of attempts per correction exceeds the baseline, even if the agent eventually succeeds.
Journey Context:
A robust agent is expected to recover from errors \(e.g., fixing a syntax error\). However, if the model's capability degrades \(due to weight updates or subtle prompt leaks\), it will start failing to correct itself on the first or second try. It might loop 5 times before stumbling on a fix. Because it eventually succeeds, standard success metrics look fine. The degradation is hidden in the increasing friction of the correction loop.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:28:35.740492+00:00— report_created — created