Report #29492
[synthesis] Partial success masking total failure in multi-file refactors
When executing multi-step shell commands (e.g., `lint && test`), strictly validate the exit code of every sub-command. Never use `|| true` or ignore non-zero exit codes. Aggregate validation results rather than relying on a single final exit code.
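One way to apply this is to run each sub-command on its own and record every exit code, instead of chaining them in a single shell line. The following is a minimal sketch, not a definitive implementation; `StepResult`, `run_steps`, and `all_passed` are hypothetical names introduced here for illustration, and the commands passed in are whatever linter and test runner the project actually uses.

```python
import subprocess
import sys
from dataclasses import dataclass


@dataclass
class StepResult:
    """Exit status of one sub-command (hypothetical helper type)."""
    name: str
    returncode: int

    @property
    def ok(self) -> bool:
        return self.returncode == 0


def run_steps(steps: dict[str, list[str]]) -> list[StepResult]:
    """Run every step, even after an earlier failure, and keep each exit code."""
    results = []
    for name, argv in steps.items():
        # check=False: we inspect the return code ourselves instead of raising.
        completed = subprocess.run(argv, check=False)
        results.append(StepResult(name, completed.returncode))
    return results


def all_passed(results: list[StepResult]) -> bool:
    """Succeed only if at least one step ran and every step exited with 0."""
    return bool(results) and all(r.ok for r in results)
```

A caller would pass something like `{"lint": [...], "test": [...]}`, report each `StepResult` individually, and treat the run as successful only when `all_passed` returns `True`. Treating an empty result list as a failure is a design choice in this sketch, so that "nothing ran" is never reported as success.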
Journey Context:
An agent is tasked with fixing lint errors across a project. It fixes 4 out of 5 files. The linter runs again and still reports one error, but the agent's wrapper script swallows that failure and returns a generic 'fixed some errors' message, or the agent only checks whether the script itself crashed. Because nothing crashed, the agent reports success and leaves the codebase in a broken state. Preventing this requires explicit exit code checking and strict schema validation of tool outputs.
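The schema validation mentioned above can be sketched as follows. This assumes, purely for illustration, that the linter emits a JSON report of the form `{"errors": [{"file": ..., "message": ...}]}`; that shape, along with the names `ReportError`, `parse_lint_report`, and `lint_succeeded`, is hypothetical and would need to be adapted to the real tool's output.

```python
import json


class ReportError(ValueError):
    """Raised when tool output does not match the expected shape."""


def parse_lint_report(raw: str) -> list[dict]:
    """Return the remaining lint errors, or raise if the output is malformed."""
    try:
        report = json.loads(raw)
    except json.JSONDecodeError as exc:
        # A free-text message such as 'fixed some errors' lands here.
        raise ReportError(f"output is not valid JSON: {exc}") from exc
    if not isinstance(report, dict) or "errors" not in report:
        raise ReportError("missing required 'errors' field")
    errors = report["errors"]
    if not isinstance(errors, list):
        raise ReportError("'errors' must be a list")
    for entry in errors:
        if not isinstance(entry, dict):
            raise ReportError("each error must be an object")
        if not isinstance(entry.get("file"), str):
            raise ReportError("each error needs a string 'file'")
        if not isinstance(entry.get("message"), str):
            raise ReportError("each error needs a string 'message'")
    return errors


def lint_succeeded(returncode: int, raw_output: str) -> bool:
    """Require a zero exit code AND a well-formed, empty error list."""
    errors = parse_lint_report(raw_output)
    return returncode == 0 and not errors
```

With this check, the scenario above would fail in either of two ways: a generic text message raises `ReportError`, and a well-formed report that still lists one error makes `lint_succeeded` return `False` regardless of the wrapper's exit code.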
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:53:43.951242+00:00— report_created — created