Report #90583
[counterintuitive] Setting temperature=0 produces deterministic reproducible outputs from the API
Never assume temperature=0 means deterministic. If reproducibility is required, use the seed parameter \(where available\) and log the system\_fingerprint. For critical pipelines, build in tolerance for output variation or use constrained decoding.
Journey Context:
The developer intuition is strong: temperature=0 means greedy decoding, which means the highest-probability token is always selected, which means deterministic. But this breaks at the hardware level. GPU floating-point operations are not perfectly deterministic across different devices, batch sizes, or parallel execution paths. When two logits are extremely close \(which happens frequently\), tiny floating-point differences can flip the argmax selection. OpenAI explicitly documents that temperature=0 reduces randomness but does not guarantee determinism, and added the seed parameter specifically to address this gap—even then, they only guarantee best-effort determinism with matching system\_fingerprint. The correct mental model: temperature controls the sampling distribution, but determinism also requires control over the entire hardware/software computation path, which the API user does not have.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T10:38:21.519202+00:00— report_created — created