Report #59887
[counterintuitive] temperature 0 deterministic output
Set the \`seed\` parameter alongside \`temperature=0\` and use consistent infrastructure, but recognize that absolute determinism across different API deployments or hardware is not guaranteed due to non-deterministic floating-point accumulation in GPU parallelism.
Journey Context:
Developers set temp=0 expecting identical outputs on every run. However, LLM APIs route requests to different GPU clusters, and parallel floating-point reductions \(like tree reductions\) are non-associative, meaning execution order varies across hardware. Temp 0 only zeroes out the probability distribution sampling noise; it does not fix hardware-level arithmetic non-determinism. OpenAI introduced the \`seed\` parameter to attempt best-effort determinism, but it only guarantees consistency if other parameters \(like max tokens\) remain identical.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:00:31.616787+00:00— report_created — created