Report #38596
[counterintuitive] Why does temperature=0 still produce different outputs across API calls?
Never assume temperature=0 guarantees deterministic output. Use the seed parameter (where available) and design pipelines robust to output variance. For exact reproducibility, log inputs and outputs rather than expecting re-computation to match.
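One way to log inputs and outputs is a small record-and-replay layer keyed on the request parameters. The following is a minimal sketch, not a definitive implementation. `ReplayLog`, `request_key`, `flaky_model`, and the field names in `params` are all hypothetical and stand in for whatever client and request shape a pipeline actually uses. The store is in memory here; a real pipeline would presumably persist it.

```python
import hashlib
import itertools
import json


def request_key(params: dict) -> str:
    """Hash the request parameters in a canonical form (sorted keys)."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class ReplayLog:
    """Hypothetical helper: record each output once, then replay it."""

    def __init__(self):
        self._records = {}

    def complete(self, params: dict, call_model) -> str:
        key = request_key(params)
        if key not in self._records:
            # First time this exact request is seen: call the model and log it.
            self._records[key] = {"params": params, "output": call_model(params)}
        # Later calls return the logged output instead of re-computing.
        return self._records[key]["output"]


# Stand-in for a real API call (assumption): returns a new string every time,
# which mimics a backend whose output varies between identical requests.
_calls = itertools.count()


def flaky_model(params: dict) -> str:
    return f"answer-{next(_calls)}"


log = ReplayLog()
params = {"model": "example-model", "temperature": 0, "seed": 42, "prompt": "hi"}
first = log.complete(params, flaky_model)
second = log.complete(params, flaky_model)
print(first == second)  # True: the second call is served from the log
```

Keying on a canonical hash means the same request is recognized regardless of dictionary key order, and the logged output stays stable even if the backend later changes.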
Journey Context:
Developers set temperature=0 expecting bit-for-bit identical outputs across runs. But temperature=0 only eliminates sampling randomness: it selects the highest-probability token at each step. The actual computation still involves GPU floating-point operations (softmax, attention) that are non-deterministic across runs, because floating-point addition is not associative and parallel reductions may accumulate terms in a different order each time. The resulting differences are tiny, but when two candidate tokens have nearly equal probabilities they can flip which one ranks highest, and every later token is conditioned on that choice, so the outputs diverge from that point on. OpenAI's API documentation describes the seed parameter as a best-effort mitigation and states that determinism is not guaranteed. The mental model: temperature=0 removes one source of randomness (sampling), but computational non-determinism remains. Even with seed, minor implementation changes between API versions can alter outputs.
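The effect of reduction order can be reproduced on a CPU with plain Python floats. This is a toy sketch: `sum_in_order` and `greedy_pick` are hypothetical helpers, the token names are made up, and the magnitudes are deliberately exaggerated so the difference is easy to see. Real logit discrepancies are far smaller and only matter when two tokens are nearly tied.

```python
def sum_in_order(terms, order):
    """Accumulate terms in the given index order, like one reduction schedule."""
    total = 0.0
    for i in order:
        total += terms[i]
    return total


def greedy_pick(logits):
    """Return the highest-logit token, which is what temperature=0 decoding does."""
    return max(logits, key=logits.get)


# The same three contributions to one logit, summed in two different orders.
terms = [1e16, 1.0, -1e16]
no_logit_a = sum_in_order(terms, (0, 1, 2))  # the 1.0 is absorbed by 1e16
no_logit_b = sum_in_order(terms, (0, 2, 1))  # the large terms cancel first

# A competing token with a fixed logit (illustrative value).
run_a = {"yes": 0.5, "no": no_logit_a}
run_b = {"yes": 0.5, "no": no_logit_b}

print(no_logit_a, no_logit_b)                    # 0.0 1.0
print(greedy_pick(run_a), greedy_pick(run_b))    # yes no
```

Both runs use identical inputs and greedy selection, yet they choose different tokens purely because of accumulation order.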
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T19:15:21.099376+00:00— report_created — created