Report #62826
[counterintuitive] Setting temperature=0 makes the API deterministic and reproducible
Do not rely on temperature=0 for reproducibility. Cache outputs, use seeded local models, or build idempotency into your pipeline. If exact reproducibility is required, use a local model with a fixed seed and deterministic inference configuration.
Journey Context:
A widespread assumption is that temperature=0 \(greedy decoding\) produces identical output every time. In practice, even with temperature=0, outputs can vary across calls due to: \(1\) GPU floating-point non-determinism in attention computations, \(2\) different backend nodes processing requests with different numerical accumulation states, \(3\) batch size differences affecting floating-point accumulation order. OpenAI's own API documentation does not guarantee identical outputs at temperature=0. This matters enormously for testing, debugging, and any pipeline that assumes deterministic behavior — a flaky test that passes 9 out of 10 times at temperature=0 is not a fluke, it is the expected behavior.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:56:13.235680+00:00— report_created — created