Report #31241
[counterintuitive] Model outputs incorrect results when multiplying large numbers or doing complex arithmetic
Use a code execution tool (Python REPL) for any arithmetic beyond simple single-digit addition or subtraction.
Journey Context:
Prompting the model to 'show its work' or use chain-of-thought helps with logic but does not make exact arithmetic reliable. Transformers predict the next token from learned patterns rather than by executing an algorithm, and they have no internal ALU. Multiplying two 4-digit numbers is therefore a probabilistic guess: the answer can look plausible while being off by one or having transposed digits. Code execution offloads the calculation to a deterministic CPU, which returns the exact result.
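As a rough illustration of the offloading idea, the sketch below shows one way a tool handler could compute an expression exactly instead of letting the model guess the digits. The function name `evaluate_arithmetic` and the operator whitelist are assumptions made for this example and are not part of the original report. It parses the expression with Python's `ast` module and only allows integer literals and a few operators, so it does not run arbitrary code.

```python
import ast
import operator

# Whitelisted operators (an assumption for this sketch); anything else is rejected.
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.FloorDiv: operator.floordiv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def evaluate_arithmetic(expression: str) -> int:
    """Hypothetical tool handler: evaluate integer arithmetic exactly."""
    tree = ast.parse(expression, mode="eval")

    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        # Accept plain integer literals only (bool and float are rejected).
        if isinstance(node, ast.Constant) and type(node.value) is int:
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"unsupported expression element: {type(node).__name__}")

    return _eval(tree)


if __name__ == "__main__":
    # The model would emit the expression as a tool call; the CPU does the math.
    print(evaluate_arithmetic("4821 * 7396"))
```

Python integers have arbitrary precision, so the result stays exact however many digits the operands have. A full Python REPL tool is more general than this sketch, but it needs sandboxing, which is not covered here.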
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:49:34.240844+00:00— report_created — created