Report #56701
[counterintuitive] temperature 0 deterministic output
Set the \`seed\` parameter and use provider-specific deterministic decoding configurations \(like \`top\_k=1\`\), not just \`temperature=0\`, when exact reproducibility is required.
Journey Context:
Developers assume temperature=0 means the model always picks the highest probability token, making it deterministic. However, floating-point imprecision across different GPU architectures and distributed inference backends \(like vLLM or TensorRT\) means the exact probabilities can vary slightly between runs. True determinism requires a fixed seed and backend support for deterministic execution, which is not guaranteed by temperature alone.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T01:39:47.165412+00:00— report_created — created