Report #55374
[counterintuitive] Does temperature 0 make LLM output deterministic?
Set the `seed` parameter alongside `temperature=0` and compare the `system_fingerprint` across responses, but design your system to tolerate small variations, because determinism is not guaranteed across distributed GPU clusters.
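One way to make "tolerate small variations" concrete is to measure them. The sketch below is only an illustration: `Sample`, `check_reproducibility`, and `is_stable` are hypothetical helper names, and `generate` stands in for whatever function issues your request with `temperature=0` and a fixed `seed` and returns the response text together with its `system_fingerprint`.

```python
from collections import defaultdict
from typing import Callable, Dict, NamedTuple, Set


class Sample(NamedTuple):
    """One response: the generated text and the fingerprint reported with it."""
    text: str
    fingerprint: str


def check_reproducibility(generate: Callable[[], Sample], runs: int = 5) -> Dict[str, Set[str]]:
    """Call generate() several times and group the distinct outputs by fingerprint.

    `generate` is an assumption of this sketch: it should send the same prompt
    with temperature=0 and a fixed seed on every call.
    """
    outputs_by_fingerprint: Dict[str, Set[str]] = defaultdict(set)
    for _ in range(runs):
        sample = generate()
        outputs_by_fingerprint[sample.fingerprint].add(sample.text)
    return dict(outputs_by_fingerprint)


def is_stable(outputs_by_fingerprint: Dict[str, Set[str]]) -> bool:
    """True when a single fingerprint appeared and it produced a single output."""
    return len(outputs_by_fingerprint) == 1 and all(
        len(texts) == 1 for texts in outputs_by_fingerprint.values()
    )


if __name__ == "__main__":
    # Stub generator so the sketch runs offline; replace with a real API call.
    report = check_reproducibility(lambda: Sample("hello", "fp_example"), runs=3)
    print(report, is_stable(report))
```

If the report shows several outputs under one fingerprint, the variation is not explained by a backend change, and downstream code (caches, snapshot tests, exact-match assertions) should not rely on byte-identical output.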
Journey Context:
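Developers assume temperature 0 means argmax (greedy) decoding, which mathematically should be deterministic. However, floating-point addition is not associative, and the order in which GPUs accumulate partial sums can vary across devices and cluster nodes. The result is slightly different logits; when the top candidates are nearly tied, a different token can be selected, and every later token is conditioned on that change, so the outputs diverge. OpenAI introduced the `seed` parameter to request reproducible sampling, but it is best effort: even with a fixed seed, behavior is only 'mostly deterministic', and a changed `system_fingerprint` signals a backend change that can alter outputs.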
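The accumulation-order effect can be reproduced on a CPU in plain Python. This is a deliberately exaggerated sketch, not a model of any real inference stack: the term values, the token names, and the helpers `accumulate` and `greedy_pick` are made up for illustration. In practice the differences are far smaller and only matter when two logits are nearly tied.

```python
from typing import Dict, Sequence


def accumulate(terms: Sequence[float]) -> float:
    """Sum terms strictly left to right, as one fixed reduction order would."""
    total = 0.0
    for term in terms:
        total += term
    return total


def greedy_pick(logits: Dict[str, float]) -> str:
    """Greedy (argmax) decoding over a tiny vocabulary."""
    return max(logits, key=logits.get)


# The same three terms, summed in two different orders.
order_a = [1e16, 1.0, -1e16]  # the 1.0 is lost when added to 1e16
order_b = [1e16, -1e16, 1.0]  # the large terms cancel first, so 1.0 survives

logit_a = accumulate(order_a)
logit_b = accumulate(order_b)

# A competing token with a fixed logit between the two results.
pick_a = greedy_pick({"yes": logit_a, "no": 0.5})
pick_b = greedy_pick({"yes": logit_b, "no": 0.5})

print(logit_a, logit_b, pick_a, pick_b)
```

Both orders are mathematically equal, yet they round differently, and argmax then selects a different token. Greedy decoding is deterministic only if the logits themselves are bit-identical from run to run.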
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:26:13.029849+00:00 - report_created - created
2026-06-19T23:40:28.762079+00:00 - confirmed_via_duplicate_submission - confirmed