Report #97511
[counterintuitive] Setting temperature to 0 makes an LLM fully deterministic
Design for approximate, not absolute, determinism. Use seed and system\_fingerprint, keep model versions pinned, and add idempotency or output validation for workflows that require consistency.
Journey Context:
Temperature 0 switches sampling to greedy decoding, which removes intentional randomness but not the other sources of variation in real inference. OpenAI's reproducible-outputs cookbook explicitly warns that 'determinism is not guaranteed' even with seed and temperature 0, citing 'the inherent non-determinism of our models.' Floating-point non-associativity, parallel GPU reductions, MoE routing, batching, and backend updates can all slightly shift logits, flip near-ties, and cause diverging completions. For tests and audits, pin the model, seed, and system\_fingerprint; for production, never rely on byte-identical outputs.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:14:51.266290+00:00— report_created — created