Report #84521
[counterintuitive] Setting temperature to 0 should make the API deterministic but I get different outputs each call
Use the seed parameter (where available) together with temperature=0 for near-deterministic output. For strict determinism requirements, cache results or use local models with fixed random seeds and deterministic inference flags.
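The caching part of this workaround can be sketched as follows. This is a minimal illustration, not any provider's API: `CachedGenerator`, `request_key`, and `flaky_backend` are hypothetical names, and `call_model` stands in for whatever client function actually sends the request. The idea is that the first response for a given (model, prompt, temperature, seed) is stored and replayed, so repeated calls are identical regardless of what the backend does.

```python
import hashlib
import itertools
import json
from typing import Callable, Dict


def request_key(model: str, prompt: str, temperature: float, seed: int) -> str:
    """Build a stable cache key from every parameter that affects the output."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "temperature": temperature, "seed": seed},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedGenerator:
    """Replay the first response seen for each distinct request (hypothetical helper)."""

    def __init__(self, call_model: Callable[[str, str, float, int], str]) -> None:
        # call_model is an assumption: any function that performs the real API call.
        self._call_model = call_model
        self._cache: Dict[str, str] = {}

    def generate(self, model: str, prompt: str,
                 temperature: float = 0.0, seed: int = 0) -> str:
        key = request_key(model, prompt, temperature, seed)
        if key not in self._cache:
            self._cache[key] = self._call_model(model, prompt, temperature, seed)
        return self._cache[key]


# Stand-in backend that returns a different string on every call,
# imitating a cloud API that is not deterministic at temperature=0.
_variant = itertools.count(1)


def flaky_backend(model: str, prompt: str, temperature: float, seed: int) -> str:
    return f"{prompt} (variant {next(_variant)})"


if __name__ == "__main__":
    gen = CachedGenerator(flaky_backend)
    print(gen.generate("some-model", "hello"))  # first call hits the backend
    print(gen.generate("some-model", "hello"))  # second call is served from the cache
```

The cache here lives in memory only; a persistent store would be needed for results to survive a restart. Any parameter that affects the output should be part of the key.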
Journey Context:
Temperature=0 selects the highest-probability token at each step (greedy decoding), but greedy decoding is not the same as deterministic output. Floating-point addition is not associative, so GPU kernels that reduce in parallel can produce slightly different logits from run to run depending on the order of accumulation. When two candidate tokens are nearly tied, that small difference can flip the choice, and every later token then diverges. Some providers may also apply top-k sampling even at temperature 0. OpenAI's seed parameter aims for determinism, but their own docs describe the result as 'mostly deterministic': sampling is best-effort and bit-identical outputs are not guaranteed. The widespread belief that temperature=0 means deterministic output is wrong for cloud APIs.
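The reduction-order effect can be reproduced on a CPU in plain Python. The sketch below is deliberately exaggerated: the values and the token names "yes" and "no" are made up for illustration, and real logit differences are usually confined to the last few bits. It shows the same three terms summing to different results in different orders, which is enough to change which token greedy decoding picks.

```python
from typing import Dict, List


def sum_in_order(terms: List[float]) -> float:
    """Accumulate left to right, like one particular reduction order."""
    total = 0.0
    for term in terms:
        total += term
    return total


def greedy_pick(logits: Dict[str, float]) -> str:
    """Greedy decoding: return the token with the largest logit."""
    return max(logits, key=logits.get)


# The same three contributions to one logit, accumulated in two orders.
order_a = [1e16, 1.0, -1e16]   # 1.0 is absorbed by 1e16, then cancelled: 0.0
order_b = [1e16, -1e16, 1.0]   # large terms cancel first, 1.0 survives: 1.0

logit_a = sum_in_order(order_a)
logit_b = sum_in_order(order_b)

# A competing token with a fixed logit between the two results.
pick_a = greedy_pick({"yes": logit_a, "no": 0.5})
pick_b = greedy_pick({"yes": logit_b, "no": 0.5})

if __name__ == "__main__":
    print(logit_a, pick_a)  # 0.0 no
    print(logit_b, pick_b)  # 1.0 yes
```

Both runs use greedy selection with no randomness at all. Only the order of the additions differs.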
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:27:42.193466+00:00— report_created — created