Report #99479
[counterintuitive] Setting temperature=0 makes LLM outputs deterministic and reproducible.
Use temperature=0 as a consistency heuristic, but pin the model version, seed, and system_fingerprint; design evaluations around semantic equivalence, not byte equality; for strict determinism, self-host with deterministic kernel flags and fixed hardware.
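A minimal sketch of what comparing runs by equivalence rather than byte equality could look like. `RunRecord`, `normalize`, and `compare` are hypothetical names introduced for illustration, and the normalization here (JSON canonicalization, whitespace and case folding) is only a crude stand-in for semantic equivalence; a real evaluation would likely need a task-specific grader.

```python
import json
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical record of one generation; field names are illustrative."""
    model: str               # pinned model version string
    seed: int                # seed sent with the request
    system_fingerprint: str  # backend identifier returned with the response
    output: str              # raw text the model produced


def normalize(text: str) -> str:
    """Canonicalize text so cosmetic differences do not count as mismatches."""
    try:
        # Structured output: compare parsed values, ignoring key order and spacing.
        return json.dumps(json.loads(text), sort_keys=True)
    except json.JSONDecodeError:
        # Free text: collapse whitespace and ignore case.
        return re.sub(r"\s+", " ", text).strip().casefold()


def compare(a: RunRecord, b: RunRecord) -> str:
    """Classify a pair of runs instead of asserting byte equality."""
    if (a.model, a.seed) != (b.model, b.seed):
        return "not comparable: request settings differ"
    if normalize(a.output) != normalize(b.output):
        if a.system_fingerprint != b.system_fingerprint:
            return "different: backend configuration changed"
        return "different: same backend"
    return "equivalent"
```

Recording the fingerprint alongside each output lets a mismatch be attributed either to a backend change or to residual nondeterminism on the same backend, which is usually the first question when an evaluation drifts.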
Journey Context:
Temperature 0 selects the argmax token at each step, but real-world variance still arises from floating-point non-determinism, batch composition, tie-breaking, backend load balancing, and model updates. Benchmarks show byte-for-byte determinism rates ranging from 100% on short prompts to 0% on longer, open-ended prompts, even at T=0. Reproducibility is a systems property, not a single parameter.
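The floating-point and tie-breaking effects can be illustrated with a toy example in plain Python. This is a sketch, not a model of any real inference kernel: the values, `ordered_sum`, and `argmax` are made up for illustration. It shows that adding the same numbers in a different order can change the result in the last bit, and that this is enough to change which candidate an argmax picks when two logits are nearly tied.

```python
from functools import reduce
import operator


def ordered_sum(values):
    """Add floats strictly left to right, mimicking one fixed reduction order."""
    return reduce(operator.add, values)


def argmax(logits):
    """Return the index of the largest logit; ties go to the lowest index."""
    return max(range(len(logits)), key=logits.__getitem__)


# Illustrative partial contributions to one token's logit.
contributions = [0.1, 0.2, 0.3]
# A competing token whose logit sits right at the boundary.
rival_logit = 0.6

# Same numbers, two accumulation orders (as different batching or
# parallel reduction schedules might produce).
forward = ordered_sum(contributions)
backward = ordered_sum(reversed(contributions))

pick_forward = argmax([rival_logit, forward])
pick_backward = argmax([rival_logit, backward])

print(forward == backward, pick_forward, pick_backward)
```

Once a single token differs, every later token is conditioned on a different prefix, which is consistent with divergence being more likely on longer, open-ended outputs.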
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:12:27.050109+00:00— report_created — created