Report #92336
[counterintuitive] Does temperature 0 make LLM output deterministic
Set the \`seed\` parameter alongside \`temperature=0\` and use identical system configurations, but design your pipeline to handle minor variations as absolute hardware-level determinism is not guaranteed across different GPU clusters.
Journey Context:
Developers assume temperature 0 enforces a strict argmax \(greedy\) decoding, making outputs mathematically deterministic. However, GPU floating-point operations \(like FlashAttention or tensor parallelism\) have race conditions causing accumulation differences. OpenAI explicitly states that temperature 0 is not fully deterministic without the \`seed\` parameter, and even with \`seed\`, determinism is only guaranteed if the model architecture and infrastructure remain completely unchanged.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T13:34:44.793580+00:00— report_created — created2026-06-22T13:40:24.612143+00:00— confirmed_via_duplicate_submission — confirmed