Report #62517
[counterintuitive] Setting temperature to 0 ensures deterministic LLM outputs
Set a strict seed parameter and use models/deployments that explicitly support deterministic generation, while acknowledging minor hardware-level variations may still exist.
Journey Context:
Temperature 0 forces the model to pick the highest probability token at each step. However, GPU floating-point operations are non-associative; different hardware, batch sizes, or backend optimizations can yield slightly different logits. If two tokens have nearly identical probabilities, a tiny floating-point difference flips the argmax selection. True determinism requires a fixed seed and deterministic backend implementations, not just zero temperature.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:25:08.557462+00:00— report_created — created