Report #50017
[counterintuitive] Does setting temperature to 0 make LLM output deterministic?
Do not rely on temperature=0 for strict reproducibility; set a seed parameter if supported, but recognize that even with seeds, minor infrastructure variations can cause divergences. Use explicit state machines or traditional code for strict determinism.
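One way to act on this advice is to measure reproducibility instead of assuming it. The sketch below calls a generation function several times with the same prompt and counts the distinct outputs by hash. The names `tally_outputs`, `is_stable`, `stable_stub`, and `drifting_stub` are hypothetical and exist only for illustration. The stubs stand in for a real client call and do not model any provider's API.

```python
import hashlib
from collections import Counter
from itertools import cycle
from typing import Callable


def tally_outputs(generate: Callable[[str], str], prompt: str, runs: int = 5) -> Counter:
    """Call `generate` `runs` times and count outputs by SHA-256 digest."""
    digests: Counter = Counter()
    for _ in range(runs):
        text = generate(prompt)
        digests[hashlib.sha256(text.encode("utf-8")).hexdigest()] += 1
    return digests


def is_stable(generate: Callable[[str], str], prompt: str, runs: int = 5) -> bool:
    """True only if every run produced byte-identical text."""
    return len(tally_outputs(generate, prompt, runs)) == 1


# Hypothetical stand-ins for a real model call (assumptions, not a provider API).
def stable_stub(prompt: str) -> str:
    return "42"


_drift = cycle(["42", "42", "forty-two"])


def drifting_stub(prompt: str) -> str:
    # Simulates an occasional divergence between otherwise identical calls.
    return next(_drift)


print(is_stable(stable_stub, "What is 6 * 7?"))
print(is_stable(drifting_stub, "What is 6 * 7?"))
```

A passing check over a handful of runs shows only that no divergence was observed. It does not prove determinism, so logic that must never vary still belongs in traditional code.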
Journey Context:
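Developers often assume that temperature=0 forces argmax decoding and therefore yields exactly the same string every time. It does force argmax, but the underlying GPU floating-point operations (especially across different hardware or distributed deployments) are not deterministic. Floating-point addition is not associative, so a change in reduction order can shift the logits slightly, and a near tie between the top two tokens can then resolve differently. Once one token differs, the rest of the completion can diverge as well. Adjusting top_p does not help: it defaults to 1.0, and under argmax decoding the highest-probability token is always kept whatever the top_p value. Even with seed parameters, providers guarantee only 'best effort' determinism, not bit-level equality across different backend clusters.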
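The floating-point effect can be reproduced on a CPU in a few lines. This is a minimal sketch, not a model of any real inference stack: the token names and logit values are made up, and the magnitudes are deliberately exaggerated so the effect is visible. In practice the differences are tiny and only matter when two logits are nearly tied.

```python
def accumulate(terms):
    """Sum floats strictly left to right, as one sequential reduction would."""
    total = 0.0
    for t in terms:
        total += t
    return total


def greedy_pick(logits):
    """Argmax decoding: return the token with the largest logit."""
    return max(logits, key=logits.get)


# The same three contributions to the logit of token "yes",
# reduced in two different orders (illustrative values only).
order_one = [1e16, 1.0, -1e16]  # 1.0 is absorbed by 1e16 and lost to rounding
order_two = [1e16, -1e16, 1.0]  # the large terms cancel first, so 1.0 survives

logits_one = {"yes": accumulate(order_one), "no": 0.5}
logits_two = {"yes": accumulate(order_two), "no": 0.5}

pick_one = greedy_pick(logits_one)
pick_two = greedy_pick(logits_two)
print(pick_one, pick_two)
```

Both orders compute the same mathematical sum, yet the rounded results differ and the argmax changes with them. No sampling is involved at any point.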
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T14:26:24.997712+00:00— report_created — created