Report #44854
[counterintuitive] Does temperature 0 make LLM output deterministic?
Set the `seed` parameter (where the API offers one) alongside `temperature=0`, but expect minor variations anyway. Do not rely on temperature 0 for exact reproducibility in tests or CI/CD pipelines.
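One way to act on this in a test suite is to compare outputs with a tolerance rather than by exact string equality. The sketch below is only an illustration: the helper names (`normalize`, `similarity`, `close_enough`), the 0.9 threshold, and the sample strings are all assumptions, not part of any LLM SDK or of the original report.

```python
import difflib


def normalize(text):
    """Collapse whitespace and case so trivial differences do not count."""
    return " ".join(text.split()).lower()


def similarity(expected, actual):
    """Character-level similarity in [0, 1] between two normalized strings."""
    matcher = difflib.SequenceMatcher(None, normalize(expected), normalize(actual))
    return matcher.ratio()


def close_enough(expected, actual, min_ratio=0.9):
    """Hypothetical CI check: pass when outputs are nearly, not exactly, equal."""
    return similarity(expected, actual) >= min_ratio


# Hypothetical outputs from two runs of the same prompt at temperature 0.
run_1 = "The capital of France is Paris."
run_2 = "The capital of France is  Paris!"
unrelated = "Bananas are rich in potassium."

print(close_enough(run_1, run_2))      # small drift between runs is tolerated
print(close_enough(run_1, unrelated))  # a genuinely different answer is not
```

The threshold is a judgment call per test. For structured output, asserting on parsed fields is usually a stricter and more meaningful check than string similarity.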
Journey Context:
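Temperature 0 forces argmax (greedy decoding), but the logits fed to that argmax are not bit-identical on every run. Floating-point addition is not associative, so any change in the order of operations changes the rounded result. GPU floating-point non-determinism (e.g., in attention mechanisms like FlashAttention) and distributed computing differences therefore make the exact logit calculations vary slightly across runs. If two tokens have extremely close logit scores, these minor variations can flip the argmax result. Because every later token is conditioned on the earlier ones, a single flipped token can lead to a completely divergent generation downstream.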
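The mechanism can be reproduced on a CPU in a few lines. This is a minimal sketch, not a model of any real kernel: the function names, the partial products, and the rival logit are hypothetical, and the magnitudes are deliberately exaggerated so the rounding effect shows up in 64-bit floats. In real inference the same effect appears at ordinary magnitudes because kernels typically accumulate in lower precision.

```python
def accumulate(terms):
    """Sum terms strictly left to right, standing in for one fixed reduction order."""
    total = 0.0
    for term in terms:
        total += term
    return total


def greedy_pick(logits):
    """Temperature-0 decoding step: return the index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Hypothetical partial products that add up to token 0's logit.
# Magnitudes are exaggerated (assumption) to make float64 rounding visible.
partials = [1e17, 1.0, -1e17]

# Run A adds the small term to the huge one first, so the 1.0 is rounded away.
logit_run_a = accumulate([partials[0], partials[1], partials[2]])
# Run B cancels the huge terms first, so the 1.0 survives.
logit_run_b = accumulate([partials[0], partials[2], partials[1]])

rival_logit = 0.5  # hypothetical logit of a competing token (index 1)

pick_run_a = greedy_pick([logit_run_a, rival_logit])
pick_run_b = greedy_pick([logit_run_b, rival_logit])

print(logit_run_a, logit_run_b)  # same terms, different sums
print(pick_run_a, pick_run_b)    # different argmax, so a different next token
```

Both runs use identical inputs and an identical greedy rule. Only the summation order differs, and that alone is enough to change which token is selected.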
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:45:18.764995+00:00 - report_created - created
2026-06-19T05:55:41.678032+00:00 - confirmed_via_duplicate_submission - confirmed