Report #36744
[counterintuitive] Setting temperature to 0 makes the model output deterministic and reproducible across runs
Use the seed parameter (where available) alongside temperature=0 for reproducibility. Never assume temperature=0 alone guarantees identical outputs across different API calls, sessions, or hardware.
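One way to check the assumption instead of trusting it is to call the model several times with the same prompt and count the distinct completions. The sketch below is a minimal illustration only: `generate` is a hypothetical wrapper around whatever client you use (for OpenAI's Chat Completions API it would pass `temperature=0` and a fixed `seed`), and the helper names `distinct_outputs` and `is_reproducible` are made up for this example.

```python
from collections import Counter
from typing import Callable


def distinct_outputs(generate: Callable[[str], str], prompt: str, runs: int = 5) -> Counter:
    """Call `generate` `runs` times with the same prompt and tally each distinct completion.

    `generate` is a hypothetical callable wrapping your model API call.
    """
    return Counter(generate(prompt) for _ in range(runs))


def is_reproducible(generate: Callable[[str], str], prompt: str, runs: int = 5) -> bool:
    """Return True only if every run produced exactly the same text."""
    return len(distinct_outputs(generate, prompt, runs)) == 1


if __name__ == "__main__":
    # Stand-in for a real API call so the sketch runs offline.
    def fake_generate(prompt: str) -> str:
        return prompt.upper()

    print(is_reproducible(fake_generate, "hello"))
```

A passing check over a handful of runs only shows that no divergence was observed in that sample; it does not prove the endpoint is deterministic across sessions or hardware.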
Journey Context:
Temperature=0 selects the highest-probability token at each step, which sounds deterministic. In practice, GPU floating-point arithmetic is non-associative: parallel reductions in attention computation can produce slightly different results depending on hardware, batch size, CUDA kernel selection, and GPU model. These micro-differences can flip a token selection when two candidates have nearly equal probability, and because every later token is conditioned on the earlier ones, a single flip can make the rest of the output diverge completely. This is not a bug; it is a consequence of floating-point arithmetic on parallel hardware. OpenAI introduced the seed parameter to offer best-effort reproducible sampling and returns a system_fingerprint field so callers can detect backend changes; even with a seed, identical outputs are not guaranteed. Developers who build testing, evaluation, or reproducibility workflows on temperature=0 alone get flaky results and waste time chasing phantom prompt issues.
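The mechanism can be reproduced on a CPU with plain Python floats. The sketch below is a toy illustration, not real model logits: the three terms and the competing logit of 0.6 are values chosen by hand so that summation order alone decides which token a greedy (temperature=0) pick selects.

```python
def sum_left_to_right(xs):
    """Accumulate in list order, like one possible reduction schedule."""
    total = 0.0
    for x in xs:
        total += x
    return total


def sum_right_to_left(xs):
    """Accumulate in reverse order, like a different reduction schedule."""
    total = 0.0
    for x in reversed(xs):
        total += x
    return total


def greedy_pick(logits):
    """Return the index of the largest logit; ties go to the lowest index."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hand-picked toy values: the same three terms summed in two orders.
terms = [0.1, 0.2, 0.3]
competing_logit = 0.6

run_a = [competing_logit, sum_left_to_right(terms)]  # second logit is slightly above 0.6
run_b = [competing_logit, sum_right_to_left(terms)]  # second logit is exactly 0.6

if __name__ == "__main__":
    print(run_a, "->", greedy_pick(run_a))
    print(run_b, "->", greedy_pick(run_b))
```

Both runs use mathematically identical inputs, yet the greedy pick differs because the rounding error depends on the order of additions. On a GPU the order is set by kernel and batch configuration rather than by the caller.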
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:09:20.155580+00:00— report_created — created