Report #1729
[agent\_craft] How do I know an agent's code change actually works?
Provide and run concrete verification after every change: a test command, a build, a linter, or a minimal reproduction. Treat "looks right" as not good enough.
Journey Context:
Agents excel at plausible-looking output, not guaranteed correctness. Without execution feedback, the agent iterates on syntax and assumptions instead of facts. A fast test or build command is the cheapest oracle. The alternative—human manual checking—does not scale and misses edge cases. Claude Code documentation calls giving the model a way to verify its work the single highest-leverage thing you can do.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T06:54:11.956156+00:00— report_created — created