Report #46087
[counterintuitive] Setting temperature to 0 produces deterministic, reproducible outputs from the model
Never assume temperature=0 guarantees identical outputs across runs. Use the seed parameter (where available) for reproducibility, design systems tolerant of minor variation, and never rely on exact-output reproducibility for correctness without seeded APIs.
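One way to make a test tolerant of minor variation is to compare outputs by similarity rather than exact equality. The sketch below is illustrative only: `normalize`, `outputs_match`, and the 0.9 threshold are hypothetical names and values chosen for this example, not part of any provider's API, and the standard-library `difflib` ratio is just one possible similarity measure.

```python
import difflib


def normalize(text: str) -> str:
    """Collapse runs of whitespace and lowercase the text."""
    return " ".join(text.split()).lower()


def outputs_match(a: str, b: str, threshold: float = 0.9) -> bool:
    """Return True when two model outputs are similar enough.

    The threshold is an illustrative assumption; tune it per use case.
    """
    ratio = difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()
    return ratio >= threshold


# Two runs that differ only in spacing and case are treated as equal.
run_1 = "The capital of France is Paris."
run_2 = "The capital of  France is paris."
print(outputs_match(run_1, run_2))
```

For structured outputs, comparing parsed fields (for example, the keys and values of a JSON object) is usually a stricter and more meaningful check than string similarity.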
Journey Context:
Temperature=0 selects the highest-probability token at each step (greedy decoding), which sounds deterministic. But floating-point addition is non-associative: parallel reductions on GPUs, in matrix multiplications and attention as well as the final softmax, can produce slightly different logits and probabilities depending on execution order, hardware, batch size, and framework-level optimizations. When two candidate tokens are nearly tied, these tiny differences can flip the top token, and every later token is then conditioned on a different prefix, so the outputs diverge. This is not a bug in the API; it is a fundamental property of floating-point arithmetic on parallel hardware. OpenAI explicitly documents this and provides the seed parameter, which fixes the sampling RNG alongside best-effort deterministic inference. Even with seed, some providers describe the result as 'mostly deterministic' rather than guaranteed. The practical impact: automated tests comparing exact model outputs will flake, and pipelines assuming identical outputs from identical prompts will fail intermittently. This looks like a model error but is really a consequence of how numerical computation works on parallel hardware.
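The effect described above can be reproduced on a CPU with plain Python floats. This is a toy sketch, not a model of any real inference stack: `greedy_pick`, the three terms, and the rival logit of 0.6 are hypothetical values picked so that the two summation orders land on either side of a near-tie.

```python
def greedy_pick(logits):
    """Return the index of the largest logit; ties go to the lowest index."""
    return max(range(len(logits)), key=lambda i: logits[i])


# The same three terms, summed in two different orders.
terms = (0.1, 0.2, 0.3)
left_to_right = (terms[0] + terms[1]) + terms[2]
right_to_left = terms[0] + (terms[1] + terms[2])

# A competing token whose logit sits right at the near-tie.
rival = 0.6

pick_one = greedy_pick([rival, left_to_right])
pick_two = greedy_pick([rival, right_to_left])

print(left_to_right == right_to_left)  # False: the sums differ in the last bit
print(pick_one, pick_two)              # 1 0: the greedy choice flips
```

The two sums differ by a single unit in the last place, yet that is enough to change which index wins. In a real model the same kind of flip at one decoding step changes the prefix for all following steps.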
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T07:49:53.729206+00:00— report_created — created