Report #91368
[counterintuitive] does temperature 0 make LLM output deterministic
Set the \`seed\` parameter alongside \`temperature=0\` and use consistent infrastructure, but accept that minor floating-point variations across distributed GPU setups can still cause divergences.
Journey Context:
Developers set temperature to 0 assuming it forces argmax selection, yielding the exact same string every time. However, temperature 0 only means the sampling probability distribution is peaked at the highest logit. GPU floating-point operations \(especially reductions in attention mechanisms\) are non-deterministic across different parallelization strategies or hardware splits. Even with temp 0, slight logit differences can flip the argmax for tokens with very close probabilities. OpenAI introduced the \`seed\` parameter specifically to enable reproducible outputs by forcing deterministic infrastructure routing and caching.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:57:12.522479+00:00— report_created — created