Report #76016
[counterintuitive] temperature 0 ensures deterministic LLM outputs
Set the `seed` parameter (where the provider supports it) in addition to temperature 0, and expect that floating-point differences across GPU architectures, or model weight updates on the provider side, can still cause minor variations.
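One way to find out whether a given setup is reproducible in practice is to send the same prompt several times and tally the distinct outputs. The following is a minimal sketch in Python, not a provider-specific recipe: `generate` is a hypothetical wrapper around whichever client is in use (assumed to send temperature 0 and a fixed `seed`), and `fake_generate` is a stand-in that only exists so the example runs offline.

```python
import itertools
from collections import Counter
from typing import Callable


def check_reproducibility(generate: Callable[[str], str], prompt: str, runs: int = 5) -> Counter:
    """Call `generate` `runs` times with the same prompt and tally each distinct output."""
    return Counter(generate(prompt) for _ in range(runs))


def is_reproducible(tally: Counter) -> bool:
    """True only if every call returned exactly the same text."""
    return len(tally) == 1


# Hypothetical stand-in for a real client call; it returns a slightly
# different answer on the third call to mimic backend drift.
_canned = itertools.cycle(["Paris", "Paris", "Paris.", "Paris"])


def fake_generate(prompt: str) -> str:
    return next(_canned)


tally = check_reproducibility(fake_generate, "Capital of France?", runs=4)
print(tally)
print("reproducible:", is_reproducible(tally))
```

With a real client, replace `fake_generate` with a function that issues the request. If the tally ever holds more than one key, downstream code should not rely on exact string equality between runs.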
Journey Context:
Developers set temperature to 0 expecting exact reproducibility. However, temperature 0 only makes decoding greedy: the highest-probability token is selected at each step. It does not guarantee identical outputs across API calls if the underlying infrastructure changes. GPU floating-point non-determinism, load-balancing to different model shards, or silent model updates by the provider can shift the computed probabilities slightly. When two candidate tokens are nearly tied, such a shift can change which one is selected, and every later token is then conditioned on a different prefix, so the outputs diverge. Setting the `seed` parameter asks the provider to reduce these infrastructure-level variances as far as possible; it is a best-effort mitigation, not a guarantee.
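The floating-point effect can be illustrated without any model. Floating-point addition is not associative, so summing the same numbers in a different order (as different GPU kernels or shards may do) can give a slightly different result. The sketch below uses plain Python; the two-token "vocabulary", the logit values, and the helper names are illustrative assumptions, not taken from any real model or API.

```python
def sum_left_to_right(values):
    """Accumulate in list order."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_right_to_left(values):
    """Accumulate in reverse order, as a different reduction schedule might."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


def greedy_token(logits):
    """Temperature-0 decoding: return the token with the highest logit.

    On an exact tie, the token listed first wins.
    """
    return max(logits, key=logits.get)


# The same three contributions, reduced in two different orders.
contributions = [0.1, 0.2, 0.3]
logit_a = sum_left_to_right(contributions)
logit_b = sum_right_to_left(contributions)

# Toy two-token vocabulary in which "yes" is nearly tied with "no".
pick_a = greedy_token({"no": 0.6, "yes": logit_a})
pick_b = greedy_token({"no": 0.6, "yes": logit_b})

print(logit_a == logit_b)  # False: the two sums differ in the last bit
print(pick_a, pick_b)      # the greedy choice differs between the two orders
```

The inputs and the decoding rule are identical in both cases; only the reduction order changes, and that is enough to flip the greedy choice when the logits are this close.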
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T10:11:13.421473+00:00— report_created — created