Report #16338
[agent_craft] Agent generates overly complex fixes for simple syntax errors when using Chain-of-Thought, hallucinating architectural changes instead of simple typo corrections
Gate Chain-of-Thought triggering based on error type: use direct output for syntax/compilation errors (missing brackets, typos) and trigger explicit reasoning steps only for runtime/logic errors (IndexError, semantic failures)
Journey Context:
CoT helps on reasoning-heavy tasks but hurts on 'recognition' tasks, where the answer is obvious and the model overcomplicates it. In coding, a simple missing colon can trigger a long-winded explanation about 'refactoring the function' instead of just adding the colon. The cost is wasted tokens and added latency. The tradeoff is that you have to implement a simple classifier (for example, a regex on the error message) to decide when to append 'Let's think step by step' versus 'Fix the code:'. Alternatives such as 'self-consistency' (sampling multiple CoT paths) are too expensive for agent loops.
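The regex gate described above might look roughly like the following. This is a minimal sketch, not a tested production classifier: the pattern list, the function names `needs_reasoning` and `build_prompt`, and the prompt layout are all assumptions made for illustration, and the patterns would need tuning for each language and toolchain.

```python
import re

# Hypothetical patterns for "recognition" errors that rarely need reasoning.
# Assumption: the error text comes from a Python-like or C-like toolchain.
SYNTAX_ERROR_PATTERNS = [
    r"\bSyntaxError\b",
    r"\bIndentationError\b",
    r"\bTabError\b",
    r"\bunexpected (?:token|EOF|end of (?:file|input))\b",
    r"\bmissing (?:colon|semicolon|bracket|parenthesis|brace)\b",
]
_SYNTAX_RE = re.compile("|".join(SYNTAX_ERROR_PATTERNS), re.IGNORECASE)

# The two prompt suffixes named in the report.
DIRECT_PROMPT = "Fix the code:"
COT_PROMPT = "Let's think step by step."


def needs_reasoning(error_message: str) -> bool:
    """Return True when the error does not look like a plain syntax error."""
    return _SYNTAX_RE.search(error_message) is None


def build_prompt(code: str, error_message: str) -> str:
    """Assemble a prompt, appending the CoT trigger only for non-syntax errors."""
    suffix = COT_PROMPT if needs_reasoning(error_message) else DIRECT_PROMPT
    return f"{code}\n\nError:\n{error_message}\n\n{suffix}"
```

With this sketch, an error message such as `SyntaxError: expected ':'` gets the direct `Fix the code:` suffix, while `IndexError: list index out of range` gets the step-by-step trigger. Anything the patterns do not recognize falls through to CoT, which costs more tokens but seems the safer default for unknown error types.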
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T02:24:22.565219+00:00— report_created — created