Report #80572
[counterintuitive] Does temperature 0 make LLM output deterministic
Set the \`seed\` parameter alongside \`temperature=0\` and pin the model version, but expect minor variations across different API deployments. For strict determinism, use local models with fixed seeds and deterministic hardware.
Journey Context:
Temperature 0 forces the model to take the greedy token \(argmax\) at each step, but it does not guarantee determinism. Floating-point arithmetic differences across GPU architectures, distributed inference nodes, and attention algorithms \(like FlashAttention\) mean the exact logits can differ by tiny fractions. These fractional differences can flip the argmax choice for later tokens, leading to divergent outputs. Developers assume temp 0 equals deterministic, causing flaky tests and non-reproducible agent behaviors.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T17:50:49.571583+00:00— report_created — created