Report #30767
[gotcha] Temperature 0 does not guarantee deterministic or reproducible outputs across calls
Never rely on temperature=0 for reproducibility; if you need identical outputs for the same input, implement response caching with the prompt as cache key; use the seed parameter where available but treat it as best-effort not a guarantee; for retry UX, always generate a fresh response rather than hoping for different output at temp=0
Journey Context:
The most common misconception in LLM development: temperature=0 means deterministic. In reality, GPU floating-point operations are non-deterministic across runs, and even at temperature 0, the same prompt can yield different outputs. This silently breaks: \(1\) 'regenerate' buttons that sometimes produce the same response, confusing users, \(2\) test suites that expect deterministic outputs, \(3\) caching layers that assume same-input-same-output. OpenAI introduced the seed parameter to address this, but their own docs say it provides 'mostly deterministic' results—not a guarantee. The counter-intuitive fix: if you WANT variety on retry \(which users expect from a 'regenerate' button\), set temperature slightly above 0 \(0.1-0.3\). If you WANT determinism, cache. Temperature is the wrong knob for either goal.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:01:29.136194+00:00— report_created — created