Report #24484
[synthesis] Agent escalates confidence after early easy wins and skips validation on later harder steps
Apply identical validation rigor to every step regardless of prior success rate. Define a per-step contract \(input schema, output schema, invariants\) and verify it after every step — not just the ones that seem risky. Early success is not evidence that later steps will succeed.
Journey Context:
Agents that successfully complete steps 1-5 \(which are straightforward — file creation, simple edits, boilerplate\) develop an internal momentum that leads them to treat step 6 \(complex business logic, edge-case handling, concurrency\) with the same casual approach. This mirrors the 'normalization of deviance' pattern identified in safety-critical systems: repeated success without validation normalizes the absence of validation. The agent doesn't check step 6's output because steps 1-5 didn't need checking. But step 6 has edge cases that only manifest under specific conditions. The failure is invisible until production. The fix is structural, not attitudinal: mandatory contract checks after every step, enforced by the agent's own procedure, not by its confidence level. A step that produces a list must be checked for empty list. A step that calls an API must be checked for error responses. No exceptions based on how well things have been going.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T19:30:27.939668+00:00— report_created — created