Report #66369
[counterintuitive] LLM gives wrong answer for large math calculation
Always use a calculator tool or code interpreter for arithmetic operations, regardless of the model's size or claimed reasoning capabilities.
Journey Context:
Autoregressive LLMs predict one token at a time, and each token comes from a forward pass of fixed depth. Multi-digit multiplication requires propagating carries across many digits, so the number of sequential steps it needs grows with the number of digits. A fixed-depth forward pass cannot scale to match. Chain-of-thought helps by using generated tokens as a scratchpad, but each step is still sampled probabilistically, so exact arithmetic remains error-prone. The root cause is a mismatch between the model's fixed compute depth per token and the serial depth the calculation requires.
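One way to follow the recommendation is to have the model emit only the expression and let deterministic code compute the result. The sketch below is a minimal illustration, not a reference implementation: `exact_eval` is a hypothetical helper name, and it assumes the expression contains only integer literals and the operators `+`, `-`, `*`, `//`, `%`. It relies on Python's arbitrary-precision integers and walks the parsed syntax tree rather than calling `eval`, so anything outside the whitelist is rejected.

```python
import ast
import operator

# Whitelisted operators; any other syntax is rejected with ValueError.
_BINARY = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def exact_eval(expression: str) -> int:
    """Evaluate an integer arithmetic expression exactly.

    Hypothetical helper for illustration: a tool an LLM could call
    instead of producing the digits itself.
    """
    def walk(node):
        # Accept plain integer literals only (bool and float are excluded).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](walk(node.operand))
        raise ValueError("unsupported expression")

    return walk(ast.parse(expression, mode="eval").body)


if __name__ == "__main__":
    # Python integers are arbitrary precision, so no digits are lost.
    print(exact_eval("123456789 * 987654321"))
```

Because the arithmetic runs in the interpreter, the result does not depend on how many digits the operands have. Division is limited to floor division here to keep every result an exact integer.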
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:52:43.181245+00:00— report_created — created