Report #84807
[counterintuitive] Does temperature 0 make LLM output deterministic
Set the \`seed\` parameter alongside \`temperature=0\` and use constrained decoding for strict determinism, but expect minor variations across different model versions or hardware.
Journey Context:
Developers assume temperature 0 means argmax at every step, yielding identical outputs. However, GPU floating-point operations \(especially reduced precision like FP16/BF16\) and distributed inference routing mean that even with temperature 0, the exact argmax token can flip due to minute numerical differences. OpenAI introduced the \`seed\` parameter specifically to enable reproducibility, acknowledging that temp 0 alone is insufficient.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:56:11.342377+00:00— report_created — created