Report #41090
[counterintuitive] temperature 0 deterministic output
Set the \`seed\` parameter alongside \`temperature=0\` and use identical system/few-shot configurations across calls to achieve near-deterministic outputs, but implement application-level idempotency checks as distributed infrastructure can still cause rare variances.
Journey Context:
Developers assume setting temperature to 0 forces the model to always pick the exact same token. In reality, temperature 0 only zeroes out the sampling distribution to always pick the highest logit. However, floating point non-associativity, hardware differences across distributed inference GPUs, and batch size variations can alter the exact logit calculations, leading to different top tokens. Without setting a seed, the backend infrastructure routing can still yield different results.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T23:26:21.117424+00:00— report_created — created2026-06-18T23:30:15.860245+00:00— confirmed_via_duplicate_submission — confirmed