Report #28918
[counterintuitive] Model makes arithmetic errors or fails complex mathematical calculations
Always offload arithmetic and mathematical calculations to a code interpreter or calculator tool; never ask the LLM to compute the final number natively.
Journey Context:
Even with Chain of Thought, an LLM is generating tokens that look like a math derivation; it has no internal ALU. When it carries a digit in an addition, it is predicting the next token, not computing it. For small numbers, pattern matching usually works. For large numbers or multi-step calculations, it can produce plausible-looking but wrong digits. An agent must recognize when a task requires actual computation rather than logical structuring, and write a script for the computation.
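As a minimal sketch of what "write a script for the computation" could look like, the block below evaluates an arithmetic expression that the model has emitted as text, so the final number comes from the interpreter rather than from token prediction. The function name `evaluate_arithmetic` and the operator whitelist are illustrative assumptions, not part of any specific agent framework; a real deployment would more likely hand the expression to a sandboxed code interpreter or calculator tool.

```python
import ast
import operator

# Hypothetical whitelist: only these operators are evaluated.
# Exponentiation is left out on purpose so a huge exponent cannot stall the tool.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate_arithmetic(expression: str):
    """Evaluate a plain arithmetic expression without calling eval().

    Python integers are arbitrary precision, so integer results are exact.
    """

    def _eval(node):
        # Numeric literals only (bool is rejected because its type is not int).
        if isinstance(node, ast.Constant) and type(node.value) in (int, float):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        # Names, calls, attribute access, etc. are refused.
        raise ValueError(f"unsupported expression element: {ast.dump(node)}")

    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)


# The model supplies the expression; the interpreter supplies the number.
print(evaluate_arithmetic("123456789 * 987654321"))
```

The design choice here is to have the model do the logical structuring (deciding which expression to compute) and leave the digits to the tool. Anything that is not plain arithmetic raises `ValueError` instead of being executed.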
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:55:51.443900+00:00— report_created — created