Report #76024
[counterintuitive] Does temperature 0 make LLM output deterministic?
No, not by itself. Set the `seed` parameter alongside `temperature=0` and keep the infrastructure identical, but even then accept that hardware-level floating-point variation across GPU architectures can cause outputs to diverge. For strict determinism, use constrained decoding or a local quantized model with a fixed seed, running on hardware you control.
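Since divergence can still occur, it may be worth measuring reproducibility rather than assuming it. The sketch below is one possible check, not an established method: it calls a generator several times with the same prompt and counts distinct outputs by hash. The names `count_distinct_outputs`, `is_reproducible`, `stable_generate`, and `flaky_generate` are hypothetical; in practice `generate` would wrap whatever API or local model call you use with `temperature=0` and a fixed seed.

```python
import hashlib
import itertools
from typing import Callable, Dict


def count_distinct_outputs(generate: Callable[[str], str], prompt: str, runs: int = 5) -> Dict[str, int]:
    """Call `generate` repeatedly and count how often each distinct output appears."""
    counts: Dict[str, int] = {}
    for _ in range(runs):
        text = generate(prompt)
        # Hash the text so long completions compare cheaply and exactly.
        digest = hashlib.sha256(text.encode("utf-8")).hexdigest()
        counts[digest] = counts.get(digest, 0) + 1
    return counts


def is_reproducible(generate: Callable[[str], str], prompt: str, runs: int = 5) -> bool:
    """True only if every run produced a byte-identical output."""
    return len(count_distinct_outputs(generate, prompt, runs)) == 1


# Hypothetical stand-ins for a real model call (assumption: yours would
# send the prompt with temperature=0 and a fixed seed).
def stable_generate(prompt: str) -> str:
    return "answer to: " + prompt


_call_counter = itertools.count()


def flaky_generate(prompt: str) -> str:
    # Simulates occasional divergence: every third call returns a different text.
    return "answer A" if next(_call_counter) % 3 else "answer B"


stable_ok = is_reproducible(stable_generate, "What is 2 + 2?")
flaky_ok = is_reproducible(flaky_generate, "What is 2 + 2?")
print(stable_ok, flaky_ok)
```

A check like this only reports what happened in the sampled runs; identical outputs over a handful of calls do not prove the setup is deterministic.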
Journey Context:
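Developers set `temperature=0` expecting bit-identical outputs across runs or APIs. However, hosted LLM APIs run on distributed GPU clusters where the order of floating-point accumulation varies between runs. Because floating-point addition is not associative, the resulting logits can differ slightly, and when two candidate tokens are nearly tied, that difference can change which token greedy decoding picks, even at `temperature=0`. OpenAI added a `seed` parameter to address this, but it offers only best-effort reproducibility, and only while the backend configuration stays the same (the `system_fingerprint` field in the API response indicates when it changes). True determinism requires controlling both the sampling parameters and the execution environment.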
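The accumulation-order effect can be reproduced on a CPU with plain Python floats. The following is a toy illustration, not a model of any real GPU kernel: the numbers are chosen to exaggerate the effect, and `greedy_pick` is a hypothetical helper standing in for greedy (temperature 0) token selection.

```python
def greedy_pick(logits):
    """Return the index of the largest logit, as greedy decoding would."""
    return max(range(len(logits)), key=lambda i: logits[i])


# The same three terms, accumulated in two different orders.
terms = [1e17, 1.0, -1e17]

# Order 1: the 1.0 is added to 1e17 first and is lost to rounding.
sum_order_1 = (terms[0] + terms[1]) + terms[2]

# Order 2: the two large terms cancel first, so the 1.0 survives.
sum_order_2 = (terms[0] + terms[2]) + terms[1]

# Treat the accumulated value as the logit of token 0, competing with
# a fixed logit of 0.5 for token 1 (illustrative values only).
pick_order_1 = greedy_pick([sum_order_1, 0.5])
pick_order_2 = greedy_pick([sum_order_2, 0.5])

print(sum_order_1, sum_order_2)    # 0.0 1.0
print(pick_order_1, pick_order_2)  # 1 0
```

No sampling randomness is involved here. The chosen token changes only because the same values were summed in a different order, and in real models the discrepancies are far smaller and matter only when candidate tokens are nearly tied.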
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:11:49.146729+00:00— report_created — created