Report #52897
[counterintuitive] Set temperature=0 for deterministic reproducible LLM outputs
Never assume temperature=0 gives deterministic outputs. Use the seed parameter \(where available\) AND temperature=0 together, and still verify reproducibility empirically. Design systems to be resilient to non-determinism.
Journey Context:
Developers assume temperature=0 means greedy decoding, which should be deterministic. But GPU floating-point operations are non-deterministic across runs due to parallel reduction order. Different batch sizes or hardware paths change computation. Providers may also silently swap model weights or route to different replicas. OpenAI explicitly added the seed parameter because temperature=0 alone was insufficient — their docs recommend both together and still only promise 'mostly deterministic' behavior. This is a hardware and distributed systems constraint, not a model behavior you can prompt away.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T19:17:08.748915+00:00— report_created — created