Report #53097
[counterintuitive] Why does the LLM make basic arithmetic errors on large numbers despite step-by-step reasoning?
Offload all multi-digit arithmetic to a calculator tool or Python interpreter. Never ask an autoregressive LLM to compute exact math natively.
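One way to offload the arithmetic is sketched below. This is a minimal illustration, not a specific framework's tool API: the function name `calculate` and the choice to accept only integer expressions are assumptions made for this example. It parses the expression with Python's `ast` module instead of calling `eval()`, so text produced by the model is never executed as arbitrary code.

```python
import ast
import operator

# Hypothetical calculator tool: the operators allowed here are an assumption
# chosen to keep the sketch small (integer +, -, *, //, % only).
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}


def calculate(expression: str) -> int:
    """Evaluate an integer arithmetic expression exactly, without eval()."""

    def _eval(node):
        # Plain integer literal (bools and floats are rejected).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        # Whitelisted binary operator applied to two sub-expressions.
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        # Unary plus or minus.
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval").body)


# The model emits the expression as text; the tool returns the exact result.
print(calculate("123456789 * 987654321"))
```

Python integers are arbitrary precision, so the result is exact for operands of any length. How the model's tool call is routed to this function depends on the framework in use and is not shown here.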
Journey Context:
Developers often assume that bigger models or better chain-of-thought (CoT) prompts will eventually solve math. However, autoregressive LLMs generate text left to right, most significant digit first. Schoolbook multi-digit multiplication runs the other way: carries propagate from the least significant digit upward, so the leading digits of the result depend on every digit to their right. Because the model must commit to the leftmost digits first, it is forced to guess the carry before computing it. This mismatch between generation order and carry order makes exact, unaided multi-digit arithmetic unreliable for standard autoregressive transformers, and scale alone does not remove it.
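The carry dependence can be seen in a small sketch. Addition is used here instead of multiplication only because it is shorter; the carry mechanism is the same. The helper name `add_right_to_left` is hypothetical and exists only for this illustration.

```python
def add_right_to_left(a: str, b: str) -> str:
    """Schoolbook addition over decimal strings, least significant digit first."""
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)
    carry = 0
    digits = []
    # Walk from the last digit to the first, carrying as we go.
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))
        carry = total // 10
    if carry:
        digits.append(str(carry))
    return "".join(reversed(digits))


# Changing only the LAST digit of one operand changes the FIRST digit
# (and the length) of the result.
print(add_right_to_left("4999", "5000"))  # 9999
print(add_right_to_left("4999", "5001"))  # 10000
```

A generator that writes the answer left to right has to decide between a leading "9" and a leading "1" before it has processed the units digit that determines which one is correct.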
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:37:13.196930+00:00— report_created — created