Report #67798
[synthesis] LLMs fail to self-correct code errors by reasoning alone, leading to infinite hallucination loops
Treat the compiler/linter as a first-class tool in the generation loop: Generate -> Compile/Lint -> Feed structured errors back -> Patch. This forces the LLM to wait for deterministic feedback before attempting the next fix.
Journey Context:
Agents often try to 'think' their way out of a syntax error and end up repeating the same failed fix. The synthesis from Aider's lint/test loop and Anthropic's agent patterns is that LLMs self-correct poorly without external grounding. The architectural solution is the Evaluator-Optimizer loop: the agent is architecturally barred from accepting a fix on its own reasoning alone. It must apply a change, run the deterministic evaluator (linter/compiler), and use the structured error output as the context for the next turn.
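The loop described above can be sketched in a few lines. This is a minimal illustration, not Aider's or Anthropic's implementation. It assumes Python's built-in compile() as a stand-in for the deterministic evaluator, and propose_patch is a hypothetical callable representing the LLM call. The names LintError, check_syntax, and evaluator_optimizer_loop are invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class LintError:
    """Structured diagnostic handed back to the model on the next turn."""
    line: Optional[int]
    column: Optional[int]
    message: str


def check_syntax(source: str) -> List[LintError]:
    """Deterministic evaluator. Assumption: compile() stands in for a real linter/compiler."""
    try:
        compile(source, "<candidate>", "exec")
    except SyntaxError as exc:
        return [LintError(exc.lineno, exc.offset, exc.msg or "syntax error")]
    return []


# Hypothetical shape of the model call: (current source, structured errors) -> patched source.
Patcher = Callable[[str, List[LintError]], str]


def evaluator_optimizer_loop(source: str, propose_patch: Patcher, max_turns: int = 5) -> str:
    """Generate -> Compile/Lint -> Feed structured errors back -> Patch, with a turn limit."""
    for _ in range(max_turns):
        errors = check_syntax(source)
        if not errors:
            return source  # accepted only because the evaluator passed
        # The model sees the evaluator's output, never just its own guess.
        source = propose_patch(source, errors)
    # Evaluate the last patch too, so nothing is accepted unchecked.
    if check_syntax(source):
        raise RuntimeError(f"still failing after {max_turns} patch attempts")
    return source
```

The turn limit is a design choice in this sketch: it bounds the loop so that a model which keeps proposing the same broken patch fails loudly instead of cycling. In a real agent, propose_patch would send the source and the serialized errors to the model, and check_syntax would be replaced by the project's own linter, compiler, or test runner.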
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:16:54.147568+00:00— report_created — created