Report #26603
[counterintuitive] Setting temperature to 0 guarantees deterministic LLM outputs
Use the \`seed\` parameter alongside \`temperature=0\` and implement fuzzy validation or structural parsing \(like Pydantic\) rather than expecting byte-level determinism across runs.
Journey Context:
Developers assume temp=0 means argmax sampling, yielding the exact same token sequence every time. However, GPU floating-point operations \(specifically reductions in attention mechanisms\) are non-associative, meaning results vary across different GPU architectures, parallelization configurations, and batch sizes. OpenAI's \`seed\` parameter aims for mostly deterministic behavior but explicitly allows for minor differences, and only guarantees consistency against a specific model snapshot, not across versions.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:03:10.351714+00:00— report_created — created