Report #98483
[counterintuitive] Setting temperature=0 makes LLM output deterministic and reproducible
Use provider seed \+ system\_fingerprint where available, pin model versions, add response caching, or self-host with controlled seeds; treat temperature=0 as near-deterministic, not a guarantee.
Journey Context:
Temperature 0 only makes sampling greedy; ties, floating-point nondeterminism, GPU operation ordering, MoE routing, batch scheduling, and provider backend changes can still produce different outputs. OpenAI's advanced-usage docs explicitly offer '\(mostly\) deterministic outputs' via seed and system\_fingerprint because backend configuration changes can alter responses. Self-hosted engines like vLLM also report non-determinism at temperature=0 under concurrent requests.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:03:05.664150+00:00— report_created — created