Report #71276
[counterintuitive] temperature 0 deterministic output LLM
Set the `seed` parameter alongside `temperature=0` and pin the exact model version, but recognize that exact bit-wise determinism across different hardware or distributed inference backends is not guaranteed.
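Since determinism is not guaranteed, it can help to measure reproducibility rather than assume it. The following is a minimal sketch, not any provider's actual API: `PINNED_MODEL`, `build_params`, and `count_distinct_outputs` are hypothetical names, the parameter keys are modeled loosely on APIs that expose a seed option, and `generate` stands for whatever client call you already use.

```python
from collections import Counter
from typing import Any, Callable, Dict

# Hypothetical pinned model identifier (assumption): substitute the dated
# snapshot name your provider documents, not a floating alias.
PINNED_MODEL = "example-model-2024-01-01"


def build_params(prompt: str, seed: int = 1234) -> Dict[str, Any]:
    """Collect the settings the report recommends into one request dict."""
    return {
        "model": PINNED_MODEL,   # exact version, so backend upgrades are explicit
        "temperature": 0,        # greedy decoding
        "seed": seed,            # fixed seed, if the API accepts one
        "prompt": prompt,
    }


def count_distinct_outputs(
    generate: Callable[[Dict[str, Any]], str],
    prompt: str,
    runs: int = 5,
) -> Counter:
    """Call `generate` several times with identical params and tally outputs.

    More than one key in the returned Counter means the outputs were not
    reproducible, even with temperature 0 and a fixed seed.
    """
    params = build_params(prompt)
    return Counter(generate(params) for _ in range(runs))


if __name__ == "__main__":
    # Stand-in for a real client call (assumption): always returns the same text.
    def fake_generate(params: Dict[str, Any]) -> str:
        return "hello"

    tally = count_distinct_outputs(fake_generate, "Say hello", runs=3)
    print(len(tally), "distinct output(s) across 3 runs")
```

In a test suite, a check like this could gate on `len(tally) == 1` for prompts that must be stable, and fall back to looser comparisons (for example, parsed fields rather than raw strings) where occasional divergence is tolerable.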
Journey Context:
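Developers set `temperature=0` expecting perfectly reproducible outputs for testing or reliable agent loops. However, `temperature=0` only forces greedy decoding (selecting the highest-probability token at each step). It does not eliminate non-determinism caused by floating-point accumulation differences in GPU operations, different tensor partitioning across distributed nodes, or minor backend changes. Floating-point addition is not associative, so summing the same values in a different order can change the low-order bits of a logit; when two candidate tokens are nearly tied, that is enough to change which one is selected, and every later token is then conditioned on a different prefix. As a result, two calls with `temperature=0` can yield different results if they are routed to different GPU architectures.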
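The floating-point accumulation issue described above can be reproduced in plain Python, without a GPU. This is an illustrative sketch only: the values are chosen to expose the effect, and `sum_in_order` and `greedy_pick` are hypothetical helpers, not part of any inference library. An explicit loop is used instead of the built-in `sum`, because recent Python versions apply compensated summation to floats, which would hide the difference.

```python
def sum_in_order(values):
    """Naive left-to-right accumulation, like a simple reduction loop."""
    total = 0.0
    for v in values:
        total += v
    return total


def greedy_pick(logits):
    """Greedy decoding step: index of the largest logit (first one wins ties)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# The same three contributions, accumulated in two different orders, as can
# happen when a reduction is partitioned differently across threads or devices.
contributions = [0.1, 0.2, 0.3]
forward = sum_in_order(contributions)             # 0.6000000000000001
backward = sum_in_order(reversed(contributions))  # 0.6

# Token 0 has a fixed logit of 0.6; token 1's logit is the accumulated value.
pick_forward = greedy_pick([0.6, forward])    # token 1 wins by one ulp
pick_backward = greedy_pick([0.6, backward])  # exact tie, so token 0 wins

print(forward == backward)          # False
print(pick_forward, pick_backward)  # 1 0
```

The two sums differ only in the last bit, yet greedy selection returns a different token in each case. In a real model the gap between the top two logits is usually far larger than this, which is why divergence is occasional rather than constant.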
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T02:12:38.685571+00:00— report_created — created