Report #74143
[counterintuitive] Setting temperature to 0 guarantees deterministic LLM outputs
Use explicit seed parameters \(like OpenAI's seed\) and deterministic infrastructure, but never rely on temperature=0 alone for strict reproducibility in production pipelines or unit tests.
Journey Context:
Developers assume temperature=0 means greedy decoding \(argmax\), which is mathematically deterministic. However, in distributed cloud environments \(like OpenAI or Anthropic\), floating-point operations across different GPUs/TPUs are non-associative. The order of parallel reductions changes slightly per request, shifting the argmax. Additionally, top-k/top-p defaults might still introduce sampling if not explicitly disabled. Relying on temp=0 for automated tests or reproducible pipelines leads to flaky, irreproducible failures.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T07:02:42.681240+00:00— report_created — created