Report #59950
[counterintuitive] LLM makes arithmetic errors or fails to follow complex algorithmic steps despite Chain-of-Thought
Offload all arithmetic, sorting, and complex algorithmic state-tracking to a code execution environment; use the LLM only to orchestrate the logic, not to compute the math.
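As an illustration of the split described above, here is a minimal sketch in which the model only emits a structured request and deterministic code does the computing. The tool-call shape (`{"tool": ..., "expression": ...}`) and the names `evaluate_arithmetic` and `run_tool` are assumptions made for this example, not part of any particular framework.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected rather than executed.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def evaluate_arithmetic(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept only int/float literals (bool and str constants are rejected).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    return _eval(ast.parse(expression, mode="eval"))


def run_tool(tool_call: dict):
    """Dispatch a (hypothetical) model-emitted tool call to deterministic code."""
    if tool_call["tool"] == "calculate":
        return evaluate_arithmetic(tool_call["expression"])
    if tool_call["tool"] == "sort":
        return sorted(tool_call["items"])
    raise ValueError(f"unknown tool: {tool_call['tool']}")
```

In this arrangement the model decides which tool to call and with what arguments; the result it reports back comes from `run_tool`, not from token prediction.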
Journey Context:
The common belief is that Chain-of-Thought (CoT) lets an LLM 'think step-by-step' and therefore compute math reliably. In practice, an LLM has no internal ALU and no dedicated working memory for carrying or borrowing digits; it predicts the next token from patterns in its training data. For numbers outside common training distributions it can hallucinate carries, because token prediction does not correspond to executing a mathematical operation. CoT helps decompose the logic; it does not give the model the ability to compute.
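To make the carry-tracking point concrete, the sketch below spells out schoolbook addition with the carry held as explicit state. It is only meant to show the kind of per-digit bookkeeping that code performs deterministically; the function name `add_with_carries` is hypothetical, and in practice Python's built-in integers already handle this exactly.

```python
def add_with_carries(a: str, b: str):
    """Add two non-negative decimal strings digit by digit.

    Returns (sum_as_string, carries), where carries lists the carry
    produced at each position, least significant digit first.
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)  # pad the shorter operand with zeros
    carry = 0
    digits = []
    carries = []
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))  # digit written at this position
        carry = total // 10             # state carried into the next position
        carries.append(carry)
    if carry:
        digits.append(str(carry))       # final carry becomes a new leading digit
    return "".join(reversed(digits)), carries


result, carries = add_with_carries("999", "1")
print(result, carries)  # prints: 1000 [1, 1, 1]
```

Every carry here is a variable that is read and written exactly once per digit, which is the state a token-by-token prediction has no guaranteed mechanism to maintain.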
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:06:41.654018+00:00— report_created — created