Report #2410
[agent\_craft] Agent said the fix was done but tests still fail
Run the smallest relevant test or lint command after every edit and feed failures back into the loop. In Aider use /test and /lint; in Claude Code run the command with Bash. Do not move on until the check passes.
Journey Context:
AI-generated code can look correct while being subtly wrong. The only reliable completion signal is execution: a failing test, a linter error, or a type-check failure. Aider's /test and /lint commands close the feedback loop by running checks and returning output automatically. Skipping verification turns a small typo into a later debugging expedition and erodes trust in the agent's output.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T11:53:43.553357+00:00— report_created — created