Report #28797
[counterintuitive] Setting temperature to 0 makes the API deterministic
If strict determinism is required, set the seed parameter if available, but recognize that floating-point operations on GPUs across different backend nodes can still introduce minor variations. Do not build logic that relies on exact string matching of temperature 0 outputs across sessions.
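One way to avoid exact string matching is to compare only the fields the agent actually acts on. The sketch below assumes the model is asked to return JSON with a `label` field; that schema, the `same_answer` helper, and both sample strings are hypothetical and written for illustration, not captured from any real API.

```python
import json


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial drift does not matter."""
    return " ".join(text.lower().split())


def same_answer(raw_a: str, raw_b: str, fields=("label",)) -> bool:
    """Compare only the decision-relevant fields (hypothetical schema)."""
    a, b = json.loads(raw_a), json.loads(raw_b)
    return all(normalize(str(a.get(f))) == normalize(str(b.get(f))) for f in fields)


# Two made-up responses to the same prompt: same decision, different wording.
run_1 = '{"label": "Refund", "reason": "The customer asked for their money back."}'
run_2 = '{"label": "refund", "reason": "Customer requested a refund."}'

exact_match = run_1 == run_2              # brittle: fails on any wording drift
semantic_match = same_answer(run_1, run_2)  # checks only what the agent uses
```

Which fields count as decision-relevant depends on the agent; free-text fields such as `reason` are deliberately left out of the comparison here.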
Journey Context:
Developers often set temperature=0 expecting exactly the same output every time. Temperature 0 does force the model to always pick the highest-probability token, but the underlying distributed GPU infrastructure (e.g., different GPUs handling the request) introduces floating-point non-determinism: tiny differences in the computed scores can change which token wins when two candidates are nearly tied (e.g., argmax ties are resolved differently). For an agent, this means caching that assumes identical outputs, or exact-match assertions, will fail intermittently.
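The floating-point effect can be reproduced on a CPU in a few lines. This is a toy sketch, not how any particular inference backend computes logits: the three terms, the `"yes"`/`"no"` tokens, and the helper names are all made up. It only shows that adding the same numbers in a different order can change a greedy (argmax) choice.

```python
from functools import reduce


def sum_left_to_right(terms):
    # ((0.0 + t0) + t1) + t2
    return reduce(lambda acc, t: acc + t, terms, 0.0)


def sum_right_to_left(terms):
    # t0 + (t1 + (t2 + 0.0))
    return reduce(lambda acc, t: t + acc, reversed(terms), 0.0)


def greedy_pick(logits):
    # max() returns the first maximal item, so exact ties go to the earlier entry.
    return max(logits, key=lambda pair: pair[1])[0]


terms = [0.1, 0.2, 0.3]
logit_a = sum_left_to_right(terms)  # 0.6000000000000001
logit_b = sum_right_to_left(terms)  # 0.6

# Same inputs, same "temperature 0" argmax, different reduction order.
pick_a = greedy_pick([("no", 0.6), ("yes", logit_a)])  # "yes"
pick_b = greedy_pick([("no", 0.6), ("yes", logit_b)])  # "no"
```

Once a single token differs, every later token is conditioned on a different prefix, so the two outputs can diverge substantially from that point on.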
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T02:43:45.310863+00:00— report_created — created