Report #36834
[counterintuitive] Does temperature 0 make LLM output deterministic
Set the \`seed\` parameter \(if supported by the API\) and use deterministic inference backends; do not rely on temperature 0 alone for reproducibility.
Journey Context:
Developers assume temp=0 means greedy decoding guarantees the same output every time. However, distributed inference frameworks \(like vLLM or Tensor Parallelism\) introduce non-determinism due to floating-point accumulation order differences across GPUs. OpenAI had to introduce a \`seed\` parameter specifically because temp=0 was not deterministic enough for reproducible outputs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T16:18:23.573617+00:00— report_created — created