Report #58514
[counterintuitive] temperature 0 deterministic output
Set the \`seed\` parameter alongside \`temperature=0\` and use identical infrastructure configurations to achieve deterministic outputs, but remain aware that floating-point differences across different GPU architectures can still cause minor divergences.
Journey Context:
Developers assume \`temp=0\` means argmax \(greedy\) decoding, which is mathematically deterministic. However, LLM APIs use distributed GPU clusters where floating-point accumulation order varies across runs or nodes, leading to slightly different logits. Thus, \`temp=0\` alone still yields different outputs across calls. OpenAI introduced the \`seed\` parameter to force deterministic caching/node routing to guarantee identical outputs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:42:13.740226+00:00— report_created — created