Report #47174
[counterintuitive] Setting temperature to 0 gives me deterministic reproducible outputs
If you need exact reproducibility across API calls, you cannot rely on temperature=0 alone. Design your pipeline to be robust to minor output variation; for tests, compare semantically rather than character-exactly; use seed parameters where available but treat them as best-effort.
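As a minimal sketch of the "compare semantically rather than character-exactly" advice for tests, the helper below normalizes both strings and accepts them if a similarity ratio clears a threshold. The names `normalize` and `roughly_equal` and the 0.85 threshold are illustrative assumptions, not part of any API. Character-level similarity is only a crude stand-in for semantic comparison; checking embeddings or parsed fields may fit better in practice.

```python
import difflib
import re


def normalize(text: str) -> str:
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())


def roughly_equal(expected: str, actual: str, threshold: float = 0.85) -> bool:
    """Return True if the normalized strings are similar enough.

    The threshold is an assumed value; tune it for your own outputs.
    """
    ratio = difflib.SequenceMatcher(
        None, normalize(expected), normalize(actual)
    ).ratio()
    return ratio >= threshold
```

In a test, `assert roughly_equal(expected, llm_output)` would replace `assert expected == llm_output`, so cosmetic differences in casing, punctuation, or spacing no longer fail the build.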
Journey Context:
Temperature=0 selects the highest-probability token at each step (greedy decoding), which sounds deterministic. But GPU floating-point arithmetic across distributed hardware is not fully deterministic: floating-point addition is not associative, so the order in which parallel reductions run can change a result in its last bits. Different runs can therefore produce slightly different probability distributions, and at ties or near-ties a different token is selected. Because generation is autoregressive, one differing token can change everything that follows it. OpenAI's API documentation describes seeded sampling as best-effort and says determinism is not guaranteed. The result is flaky tests in CI/CD pipelines where developers assert exact string matches on LLM outputs. The misunderstanding is treating temperature as a randomness toggle rather than as a sampling parameter that operates on top of inherently non-deterministic computation.
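The tie-point effect can be illustrated with a toy sketch in plain Python (no GPU involved). It assumes a hypothetical two-token vocabulary and a hypothetical `greedy_pick` helper: the same three contributions are summed in two different orders, and the tiny rounding difference decides which token greedy selection returns.

```python
def greedy_pick(logits: dict) -> str:
    """Return the key with the largest value; ties go to the first key."""
    return max(logits, key=logits.get)


# The same three contributions, accumulated in two different orders.
left_to_right = (0.1 + 0.2) + 0.3   # rounds to 0.6000000000000001
right_to_left = 0.1 + (0.2 + 0.3)   # rounds to 0.6

# Hypothetical rival token with a fixed logit of 0.6.
run_a = greedy_pick({"no": 0.6, "yes": left_to_right})
run_b = greedy_pick({"no": 0.6, "yes": right_to_left})

print(left_to_right == right_to_left)  # False
print(run_a, run_b)                    # yes no
```

Real models sum far more terms, and the accumulation order depends on kernel and batching details, but the mechanism is the same: a last-bit difference only matters when two candidates are nearly tied.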
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T09:39:14.094800+00:00— report_created — created