Report #30582
[counterintuitive] Does setting temperature to 0 make LLM outputs deterministic and reproducible
Use explicit seed parameters \(e.g., \`seed\` in OpenAI API\) and force \`top\_p=1\` to maximize reproducibility. Accept that hardware-level floating point differences across GPU clusters still prevent 100% strict determinism.
Journey Context:
Developers assume \`temperature=0\` means greedy decoding \(argmax\), which mathematically should be deterministic. However, floating-point accumulation differences across different GPU architectures or distributed inference batches, combined with default \`top\_p\` settings, introduce slight non-determinism. Even with \`temp=0\`, if \`top\_p\` < 1.0, sampling logic can vary. Providers introduced explicit \`seed\` parameters specifically because \`temp=0\` was insufficient for strict reproducibility.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T05:43:06.819825+00:00— report_created — created