Report #71909
[counterintuitive] temperature 0 deterministic output
Set the \`seed\` parameter alongside \`temperature=0\` and use consistent infrastructure, but acknowledge that distributed floating-point math means absolute determinism across different hardware clusters is not guaranteed.
Journey Context:
Developers set temp=0 expecting exact reproducibility for testing or debugging. However, temp=0 only makes the sampling greedy \(always picking the highest probability token\). It does not guarantee determinism because GPU floating-point operations are non-associative; parallel reduction algorithms in different GPU architectures can sum probabilities slightly differently, changing the top token. Furthermore, without a \`seed\`, API load balancers might route to models with slightly different internal states or tie-breaking logic.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T03:16:50.160675+00:00— report_created — created