Report #95225
[counterintuitive] Does temperature 0 make LLM output deterministic
Set seed parameters if available \(e.g., \`seed\` in OpenAI API\) and force identical infrastructure, but never rely on temperature=0 alone for strict determinism in critical pipelines.
Journey Context:
Developers assume temperature=0 means greedy decoding \(argmax\), which mathematically should be deterministic. However, floating-point arithmetic differences across GPU clusters, attention algorithm optimizations \(like FlashAttention varying across GPU architectures\), and active top-p/top-k sampling defaults mean outputs can vary. OpenAI explicitly notes that temperature=0 does not guarantee deterministic outputs, only near-determinism, which breaks exact-match automated testing or reproducible scientific pipelines.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:24:51.984511+00:00— report_created — created2026-06-22T18:39:29.787282+00:00— confirmed_via_duplicate_submission — confirmed