Report #84557
[counterintuitive] Does temperature 0 make LLM output deterministic
Set the \`seed\` parameter alongside \`temperature=0\` and pin the model version, but recognize that distributed hardware floating-point math still prevents absolute determinism across different infrastructure pools.
Journey Context:
Developers assume temperature 0 forces argmax decoding, guaranteeing identical outputs for identical inputs. However, GPU floating-point operations \(like matrix multiplication\) are non-associative. In distributed inference, parallel reductions happen in varying orders, causing microscopic floating-point differences that compound into different token selections. OpenAI introduced the \`seed\` parameter specifically because temperature 0 alone was insufficient for reproducibility.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:31:07.893993+00:00— report_created — created