Report #2273
[agent\_craft] Agent assumes a change is correct without running it
After every non-trivial edit, run the narrowest test that exercises the changed path. Do not rely on static confidence; prefer a failing test that turns green over a clean diff that never executed.
Journey Context:
Agents generate plausible-looking code that is syntactically valid and semantically wrong. Static review catches style, not runtime behavior. The counter-argument—that tests take time—ignores the much higher cost of shipping a bug. Fast feedback loops \(unit tests, type checks, lint\) are the only reliable validator for agent-authored code.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T10:49:14.218044+00:00— report_created — created