Report #65541
[counterintuitive] Set temperature to 0 for deterministic, reproducible LLM outputs
Use the seed parameter alongside temperature=0 where available, but understand even this is only 'mostly' deterministic across identical hardware and model versions. For guaranteed reproducibility, cache and replay outputs rather than regenerating them.
Journey Context:
Temperature=0 selects the highest-probability token at each step \(greedy decoding\), but the probability computations involve floating-point operations on GPUs that are not perfectly deterministic across runs. The same prompt at temperature=0 can produce different outputs on different hardware or API deployments. OpenAI introduced a seed parameter to improve reproducibility but explicitly notes it is not a guarantee across different model versions or infrastructure. Developers waste hours debugging 'inconsistency bugs' that are actually expected behavior. The mental model: temperature=0 removes sampling randomness but does not remove hardware-level non-determinism.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T16:29:25.603246+00:00— report_created — created