Report #1005
[agent_craft] Shipped a plausible-looking edit that failed tests or typecheck later
After making changes, run the relevant test, typecheck, lint, or build command from the project's conventions or CLAUDE.md, and iterate on failures until the command passes. For unattended runs, use a Stop hook or a verification subagent so the agent cannot stop on 'looks done.'
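As an illustration of the Stop-hook idea, here is a minimal sketch of a verification gate. The names `CHECKS`, `run_checks`, and `main` are hypothetical, and the pytest command is a placeholder for whatever the project actually uses. The sketch also assumes the hook runner passes a JSON payload on stdin with a `stop_hook_active` flag and treats exit code 2 as "do not stop, show stderr to the agent"; confirm both against the current Claude Code hooks documentation before relying on them.

```python
"""Hypothetical verification gate: run each check and report any failure."""
import json
import subprocess
import sys

# Assumption: replace with the project's real commands (e.g. those in CLAUDE.md).
CHECKS = [
    [sys.executable, "-m", "pytest", "-q"],
]


def run_checks(checks):
    """Run each command; return (command, exit_code, output_tail) per failure."""
    failures = []
    for cmd in checks:
        proc = subprocess.run(cmd, capture_output=True, text=True)
        if proc.returncode != 0:
            # Keep only the tail of the output so the feedback stays readable.
            tail = (proc.stdout + proc.stderr)[-2000:]
            failures.append((cmd, proc.returncode, tail))
    return failures


def main(checks=CHECKS, stdin=sys.stdin):
    """Return 0 when every check passes, 2 when at least one fails."""
    # Assumption: the hook runner sends a JSON object on stdin.
    try:
        payload = json.load(stdin)
    except ValueError:
        payload = {}
    # Assumption: this flag means the agent is already continuing because of
    # this hook, so let it stop rather than loop on a check that cannot pass.
    if isinstance(payload, dict) and payload.get("stop_hook_active"):
        return 0

    failures = run_checks(checks)
    for cmd, code, tail in failures:
        print(f"FAILED ({code}): {' '.join(cmd)}\n{tail}", file=sys.stderr)
    # Assumption: exit code 2 blocks the stop and feeds stderr back to the agent.
    return 2 if failures else 0
```

To use this as a script, end the file with `sys.exit(main())` and register it as the Stop hook command. The point of the design is that the agent gets an exit code and the failing output instead of its own impression that the diff looks reasonable.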
Journey Context:
Claude Code's best-practices guide emphasizes giving the agent a pass/fail signal, because without one it stops when the work 'looks done' and every mistake waits for a human to notice. Tests, build exit codes, linters, and screenshot comparisons all close the loop. Common mistake: assuming the edit is correct because the diff looks reasonable. The antidote is to capture the verification command in CLAUDE.md and run it before declaring completion.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T15:59:03.059582+00:00— report_created — created