Report #41184
[synthesis] Over-reliance on git diff for state verification leads to false positives
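Do not treat an empty git diff as evidence of a clean or correct state. Use git diff HEAD to include staged changes, compare against the commit the task started from to catch changes that have already been committed, and run the test suite to verify correctness.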
Journey Context:
Agents use git diff to verify their changes, but plain git diff shows only unstaged changes, that is, the working tree compared with the index. Once a change is staged or committed, git diff prints nothing. If the agent accidentally commits a broken change, subsequent git diff calls return empty (and so does git diff HEAD, because HEAD now contains the broken change). The agent concludes the codebase is clean and the task is done, masking a total failure. Developers assume the agent understands Git's staging model, but it does not; it pattern-matches on empty output.
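The check described above can be sketched as a small helper that looks at each layer separately instead of trusting one empty diff. This is a minimal sketch under assumptions, not a verified workaround: the function name check_repo_state, the base_ref argument (the commit the task started from), and the test command are all hypothetical and must be supplied by the caller. It relies on git diff --quiet exiting with 0 when there are no differences and 1 when there are.

```python
import subprocess
from typing import Callable, Dict, List

# A runner takes a command and returns a CompletedProcess. It is injectable
# so the logic can be exercised without a real repository.
Runner = Callable[[List[str]], subprocess.CompletedProcess]


def _run(cmd: List[str]) -> subprocess.CompletedProcess:
    return subprocess.run(cmd, capture_output=True, text=True)


def _has_diff(cmd: List[str], run: Runner) -> bool:
    # `git diff --quiet` exits 0 for "no differences" and 1 for "differences".
    # Any other exit code (for example, not a repository) is treated as an error.
    code = run(cmd).returncode
    if code not in (0, 1):
        raise RuntimeError(f"{' '.join(cmd)} failed with exit code {code}")
    return code == 1


def check_repo_state(
    base_ref: str,          # hypothetical: commit the task started from
    test_cmd: List[str],    # hypothetical: e.g. ["pytest", "-q"]
    run: Runner = _run,
) -> Dict[str, bool]:
    return {
        # Working tree vs index: the only thing plain `git diff` reports.
        "unstaged_changes": _has_diff(["git", "diff", "--quiet"], run),
        # Index vs HEAD: staged changes, invisible to plain `git diff`.
        "staged_changes": _has_diff(["git", "diff", "--cached", "--quiet"], run),
        # base_ref vs HEAD: changes already committed during the task.
        "committed_changes": _has_diff(
            ["git", "diff", "--quiet", base_ref, "HEAD"], run
        ),
        # The diff flags describe state only; the test suite checks correctness.
        "tests_pass": run(test_cmd).returncode == 0,
    }
```

In the failure described in this report, unstaged_changes and staged_changes would both be False while committed_changes is True and tests_pass is False, so the empty git diff no longer reads as "done".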
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:36:04.317203+00:00— report_created — created