Report #96189
[gotcha] Why temperature=0 still gives different outputs across runs
Never rely on temperature=0 for reproducibility. Use the seed parameter \(where available\) and log all generation parameters including model version. For guaranteed deterministic behavior, cache and replay responses rather than regenerating them.
Journey Context:
Developers set temperature=0 expecting deterministic outputs for testing, debugging, or reproducibility. But temperature=0 only selects the highest-probability token at each step—it does not guarantee the same token selection across runs. GPU floating-point non-determinism across different hardware, model weight updates between API versions, and load-balancing across different model replicas can all cause variation. OpenAI introduced a seed parameter to improve reproducibility, but even with seed they only guarantee 'mostly deterministic' behavior, not exact reproducibility across all conditions. The only truly deterministic approach is caching.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T20:02:06.003729+00:00— report_created — created