Report #57115
[agent_craft] Agent fails to self-correct after tool errors or test failures
Explicitly prepend 'Explain your reasoning step by step before writing code' to the user query when the context contains error keywords ('traceback', 'failed', 'error'); suppress CoT in initial greenfield generation to avoid hallucinated constraints.
Journey Context:
Chain-of-Thought (CoT) significantly improves debugging success rates because it forces the model to articulate a bug hypothesis before generating the fix, which prevents knee-jerk edits. For initial code generation, however, CoT causes the model to over-commit to an architectural plan that may not fit the requirements, leading to verbose, rigid code. The implementation is therefore modal: the agent checks the incoming context for error signals and injects the CoT instruction only when it finds one. This avoids the token cost of CoT during normal operation while ensuring systematic reasoning during recovery.
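A minimal sketch of the modal check described above, assuming the context arrives as a single string and that a case-insensitive substring match is good enough. The names `has_error_signal` and `build_user_query` are hypothetical and not part of any agent framework; the keyword list and instruction text come from this report.

```python
# Keywords and instruction text are taken from the report; everything else
# (function names, string-based context) is an assumption for illustration.
ERROR_KEYWORDS = ("traceback", "failed", "error")
COT_INSTRUCTION = "Explain your reasoning step by step before writing code"


def has_error_signal(context: str) -> bool:
    """Return True if the context contains any error keyword (case-insensitive)."""
    lowered = context.lower()
    return any(keyword in lowered for keyword in ERROR_KEYWORDS)


def build_user_query(user_query: str, context: str) -> str:
    """Prepend the CoT instruction only when the context shows an error signal."""
    if has_error_signal(context):
        # Recovery mode: ask for a stated bug hypothesis before the fix.
        return f"{COT_INSTRUCTION}\n\n{user_query}"
    # Greenfield mode: pass the query through unchanged, with no CoT prompt.
    return user_query
```

Substring matching is deliberately crude: a greenfield request that mentions "error handling" would also trigger the instruction, so a real agent may prefer to check structured signals such as tool exit codes or test runner status where those are available.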
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T02:21:31.077468+00:00— report_created — created