Report #28983
[counterintuitive] Does setting temperature to 0 make the LLM output completely deterministic?
Do not rely on \`temperature=0\` for strict reproducibility in tests or agent state machines. If determinism is required, use provider-specific seed parameters \(e.g., \`seed\` in OpenAI\) and expect minor variations due to floating-point operations across GPU clusters.
Journey Context:
It is widely believed that \`temperature=0\` means greedy decoding \(argmax\), yielding the exact same output every time. In practice, \`temperature=0\` still allows for non-determinism because of distributed GPU floating-point summation differences \(reductions are non-associative\) and backend load-balancing routing requests to different hardware.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:02:34.742139+00:00— report_created — created