Report #38181
[counterintuitive] Setting temperature=0 gives deterministic, reproducible outputs
Do not assume temperature=0 yields identical outputs across runs; use seeded generation APIs where available, or design systems to be robust to output variation.
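One way to make a pipeline robust to output variation is to store the first output for a given request and replay it, so downstream steps never depend on a fresh generation matching an old one. The sketch below is illustrative only: `request_key`, `PinnedResponses`, `flaky_generate`, and the model name `example-model` are hypothetical names, and the in-memory dict stands in for whatever persistent store a real system would use.

```python
import hashlib
import itertools
import json
from typing import Callable, Dict


def request_key(model: str, prompt: str, params: dict) -> str:
    """Stable hash of everything that defines a request."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "params": params},
        sort_keys=True,  # key order must not affect the hash
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class PinnedResponses:
    """Stores the first output per request and replays it afterwards."""

    def __init__(self, generate: Callable[[str, str, dict], str]) -> None:
        self._generate = generate
        self._store: Dict[str, str] = {}  # assumption: in-memory only

    def get(self, model: str, prompt: str, params: dict) -> str:
        key = request_key(model, prompt, params)
        if key not in self._store:
            # Only the first call for a given request reaches the model.
            self._store[key] = self._generate(model, prompt, params)
        return self._store[key]


# Hypothetical stand-in for a model call whose wording drifts between runs.
_variants = itertools.cycle(["The answer is 4.", "4 is the answer."])


def flaky_generate(model: str, prompt: str, params: dict) -> str:
    return next(_variants)


pinned = PinnedResponses(flaky_generate)
first = pinned.get("example-model", "What is 2 + 2?", {"temperature": 0})
second = pinned.get("example-model", "What is 2 + 2?", {"temperature": 0})
# `first` and `second` are the same string even though the generator varies.
```

Pinning trades freshness for stability, so it suits tests and replayable pipelines more than interactive use.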
Journey Context:
With temperature=0, decoding selects the highest-probability token at each step, which sounds deterministic. In practice, GPU floating-point operations, particularly in attention computation and softmax, are not perfectly reproducible across runs, hardware, or batch sizes. Floating-point addition is not associative, so a parallel reduction that sums its terms in a different order can produce slightly different logits. When two candidate tokens are nearly tied, that difference can change which one ranks highest, and once a single token differs, the rest of the output diverges. This is not a bug; it is a property of parallel floating-point hardware. OpenAI's API documentation notes this and provides a seed parameter for best-effort deterministic sampling, but even with a seed, minor differences can occur. This matters for testing, reproducibility, and any pipeline that assumes the same input always yields the same output. Design for idempotency (for example, by storing and reusing an output once it has been generated), not for exact reproducibility of fresh generations.
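The mechanism can be reproduced on a CPU with plain Python floats. This is a toy illustration, not a model of any real GPU kernel: the two "runs" sum the same three terms in different orders, and the helper `greedy_pick` and the two-token logit lists are made up for the example.

```python
def greedy_pick(logits):
    """Index of the largest logit; ties go to the lowest index."""
    return max(range(len(logits)), key=lambda i: logits[i])


# The same three terms, summed in two different orders.
sum_run_a = (0.1 + 0.2) + 0.3   # accumulates a rounding error
sum_run_b = 0.1 + (0.2 + 0.3)   # rounds differently

# Token 0 has a fixed logit; token 1's logit comes from the reduction.
logits_run_a = [0.6, sum_run_a]
logits_run_b = [0.6, sum_run_b]

print(sum_run_a == sum_run_b)      # False
print(greedy_pick(logits_run_a))   # 1
print(greedy_pick(logits_run_b))   # 0
```

The two sums differ only in the last bit, yet that is enough to flip the greedy choice when the candidates are this close.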
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T18:33:59.504043+00:00— report_created — created