Report #65936
[counterintuitive] LLM makes basic arithmetic errors on large numbers even with chain-of-thought prompting
Offload all exact arithmetic (multiplication, long division, large addition) to a calculator tool or Python interpreter.
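A minimal sketch of what such a calculator tool could look like in Python. The function name `calculate` and the operator whitelist are assumptions made for illustration, not part of the original report. The idea is that the model emits an expression string and the host evaluates it exactly, without using `eval`.

```python
import ast
import operator
from fractions import Fraction

# Hypothetical whitelist: only these operators are evaluated.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,  # exact, because operands are Fractions
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def calculate(expression: str) -> Fraction:
    """Evaluate an integer arithmetic expression exactly.

    Only integer literals, + - * / and parentheses are accepted.
    Anything else (names, calls, attributes) raises ValueError.
    """
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # type(...) is int rejects floats, strings and booleans.
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return Fraction(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    return _eval(ast.parse(expression, mode="eval"))


if __name__ == "__main__":
    # Python integers are arbitrary precision, so the product is exact.
    print(calculate("123456789123456789 * 987654321987654321"))
```

Results come back as `Fraction` so that division stays exact instead of falling back to floating point. Division by zero still raises `ZeroDivisionError`, which the caller would need to handle.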
Journey Context:
The common belief is that LLMs are reasoning engines that just need better step-by-step prompts to do math. In reality, LLMs are pattern matchers that predict the next token, and they have no internal ALU. Exact arithmetic on large numbers requires carrying digits across positions, a procedure that does not map onto token prediction probabilities. This is an architectural limitation, not a prompting deficiency.
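To make the carrying point concrete, here is a small illustrative sketch of schoolbook addition that also reports the longest run of consecutive carries. The helper `add_with_carries` is hypothetical and says nothing about model internals. It only shows that the leading digit of a sum can depend on every lower digit.

```python
def add_with_carries(a: str, b: str) -> tuple[str, int]:
    """Add two non-negative decimal strings digit by digit.

    Returns the sum as a string and the length of the longest
    run of consecutive positions that produced a carry.
    """
    width = max(len(a), len(b))
    a, b = a.zfill(width), b.zfill(width)

    digits = []
    carry = 0
    chain = longest = 0
    # Walk from the least significant digit to the most significant.
    for da, db in zip(reversed(a), reversed(b)):
        total = int(da) + int(db) + carry
        digits.append(str(total % 10))
        carry = total // 10
        chain = chain + 1 if carry else 0
        longest = max(longest, chain)
    if carry:
        digits.append("1")
    return "".join(reversed(digits)), longest


if __name__ == "__main__":
    # A single +1 ripples through every position of 999999.
    print(add_with_carries("999999", "1"))
    # No position carries here.
    print(add_with_carries("123", "456"))
```

In the first call, the carry passes through all six positions, so getting the leading digit right requires the result of every position below it.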
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:09:20.210571+00:00— report_created — created