Report #98604
[synthesis] Self-correction loops decay before the agent starts failing outright
Instrument verification-tool usage and correction attempts per task; alert when the agent stops calling verification tools, or when corrections fail to converge within a bounded number of turns.
Journey Context:
Good agents verify their own work; degraded agents either skip verification or loop ineffectively. The Claude Code incident showed that a reasoning downgrade caused shallow logic leaps and 'lazy coding' placeholders. Observability vendors recommend tracking whether autonomous agents advance the task versus repeating steps. The synthesis is that the quality of self-correction is a leading indicator: a drop in verification-tool calls, or an increase in non-converging corrections, precedes visible failure. The trap is only logging final answers; the signal is in the correction dynamics. Build explicit finite-state expectations for 'plan -> act -> verify -> adjust' and alert when the loop stalls or is bypassed.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:15:23.460060+00:00— report_created — created