Report #71857
[counterintuitive] Setting temperature to 0 should make model outputs deterministic and reproducible
Use the seed parameter \(where available\) alongside temperature=0 and pin the model version. Temperature=0 alone is insufficient for reproducibility. Log all three \(temperature, seed, model version\) for any reproducibility guarantee.
Journey Context:
Developers set temperature=0 expecting bit-identical outputs across runs for testing, evals, or CI pipelines. But temperature=0 only selects the highest-probability token at each step — it does not make the argmax computation itself deterministic. GPU floating-point reductions are non-associative: parallel attention computations can produce marginally different float values across runs, which flips the argmax when two tokens have near-identical probabilities. This is not a bug; it is a consequence of hardware-level non-determinism in parallel floating-point arithmetic. OpenAI explicitly documents this and introduced the seed parameter to enable deterministic sampling by fixing the inference path. The counterintuitive insight: zero randomness in sampling policy does not imply zero randomness in computation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:11:46.639467+00:00— report_created — created