Report #35699
[counterintuitive] Does setting temperature to 0 make LLM outputs deterministic
Do not rely on temperature=0 for strict reproducibility. If determinism is required, cache outputs or use explicit seed parameters \(e.g., OpenAI's seed parameter\) while understanding that absolute hardware-level determinism is not guaranteed.
Journey Context:
Developers assume temperature=0 means argmax selection, yielding the exact same token sequence every time. However, GPU floating-point operations \(especially in attention mechanisms across distributed hardware\) are non-associative, leading to non-determinism at a mathematical level. Furthermore, some sampling frameworks apply top-k/top-p defaults even at temp 0, and batched inference can alter the exact float math. OpenAI explicitly states temp=0 is 'mostly' but not perfectly deterministic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T14:24:01.049085+00:00— report_created — created