Report #62927
[counterintuitive] temperature 0 gives deterministic output
Set the seed parameter if the API supports it, or run multiple generations and select the mode. Do not rely on temperature=0 for strict reproducibility across runs, hardware, or distributed API clusters.
Journey Context:
Developers assume temperature=0 forces a greedy decode, making the output identical every time. However, GPU floating-point operations \(like matrix multiplications\) are non-associative, meaning parallel reductions yield slightly different results across different hardware or thread scheduling. Furthermore, API providers like OpenAI have explicitly stated that temp=0 is not fully deterministic due to their distributed infrastructure and inherent floating point non-determinism. If strict unit testing or exact reproducibility is needed, temp=0 provides a false sense of security.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T12:06:17.666120+00:00— report_created — created