Report #59779
[counterintuitive] Setting temperature to 0 makes the LLM output deterministic
If strict determinism is required, cache outputs or pass a seed parameter (if the API supports one). Note that temperature=0 only selects greedy decoding; it does not guarantee that the underlying infrastructure executes identically from run to run.
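The caching advice above could be sketched roughly as follows. This is a minimal in-memory illustration, not a production design: `CompletionCache`, `request_key`, and the `call_model` callable are hypothetical names introduced here, and `call_model` stands in for whatever client function actually sends the request.

```python
import hashlib
import json
from typing import Callable, Dict


def request_key(model: str, prompt: str, params: dict) -> str:
    """Build a stable key from everything assumed to influence the output."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "params": params},
        sort_keys=True,  # key must not depend on dict insertion order
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CompletionCache:
    """Return the first response seen for a request on every later call."""

    def __init__(self, call_model: Callable[[str, str, dict], str]):
        # call_model is a hypothetical stand-in for the real API client call.
        self._call_model = call_model
        self._store: Dict[str, str] = {}

    def complete(self, model: str, prompt: str, params: dict) -> str:
        key = request_key(model, prompt, params)
        if key not in self._store:
            # Only the first call reaches the model; its output is frozen.
            self._store[key] = self._call_model(model, prompt, params)
        return self._store[key]
```

Repeated calls with the same model, prompt, and parameters then return the stored text regardless of what the backend would produce on a rerun. A persistent store would be needed for determinism across processes.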
Journey Context:
Developers often set temperature=0 expecting bit-identical outputs across runs. The setting forces greedy decoding (always picking the highest-probability token), but it does not guarantee determinism. Floating-point arithmetic is not associative, so differences in reduction order across distributed GPU clusters, minor framework updates, or changes in batch size can shift the underlying logits slightly. When two candidate tokens have nearly equal logits, such a shift can change which token is picked, and every later token then diverges. Furthermore, some APIs may still apply top-p sampling at temperature=0 unless it is explicitly disabled.
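The floating-point effect can be reproduced in miniature without any GPU. The sketch below is a contrived near-tie, not a trace of a real model: the logit values and the `greedy_pick` helper are made up for illustration. It sums the same three numbers in two orders and shows that greedy decoding picks a different token index.

```python
def greedy_pick(logits):
    """Greedy decoding step: index of the largest logit (first one wins ties)."""
    return max(range(len(logits)), key=logits.__getitem__)


# The same three terms, accumulated in two different orders, as might happen
# when a reduction is split differently across devices or batch sizes.
left_to_right = (0.1 + 0.2) + 0.3
right_to_left = 0.1 + (0.2 + 0.3)

# Floating-point addition is not associative, so the two sums differ
# in the last bit.
print(left_to_right == right_to_left)  # False

# Token 0 has a fixed logit; token 1's logit is the sum computed above.
competitor = 0.6
print(greedy_pick([competitor, left_to_right]))  # 1
print(greedy_pick([competitor, right_to_left]))  # 0
```

Both runs use greedy decoding on mathematically identical inputs, yet the chosen token differs. In a real model the first differing token changes the context for every token after it.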
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T06:49:35.194073+00:00— report_created — created