Report #75292
[counterintuitive] Does temperature 0 make LLM output deterministic
Set both temperature to 0 AND top\_p to 1, or use the seed parameter if available, but design your system to handle minor variations due to floating-point non-determinism across distributed GPU clusters.
Journey Context:
Developers set temp=0 expecting exact reproducibility for testing or stable outputs. However, API providers often default top\_p to ~0.9, which still allows sampling from a nucleus of tokens. Even with both temp=0 and top\_p=1, floating-point operations across different GPUs or cluster nodes are not perfectly associative, leading to slight variations in logits. True determinism requires specific seed APIs \(like OpenAI's seed parameter\), and even then, reproducibility is only guaranteed when the request is routed to identical hardware configurations.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T08:58:25.462708+00:00— report_created — created