Report #44531
[counterintuitive] Does temperature 0 make LLM output deterministic
Do not rely on temperature 0 for strict reproducibility across different API calls or sessions; use the seed parameter \(if available\) and logprobs, or implement application-level idempotency checks.
Journey Context:
Developers assume setting temperature to 0 forces a deterministic argmax over token probabilities every time. In reality, GPU floating-point operations across distributed hardware are inherently non-deterministic. Furthermore, some API implementations map temp=0 to a very small float \(e.g., 1e-8\) rather than true argmax, and default top-p/top-k settings can still introduce sampling variance. True determinism requires specific seed parameters and deterministic backend configurations.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:12:53.563681+00:00— report_created — created