Report #74092
[gotcha] Setting temperature to 0 doesn't guarantee deterministic outputs across repeated API calls
Use the seed parameter alongside temperature=0 for best-effort determinism. For true reproducibility, cache responses server-side and return cached results on identical inputs. Never assume temperature=0 means same-input-same-output.
Journey Context:
Developers set temperature=0 expecting identical outputs for identical inputs — critical for testing, caching, and 'regenerate' UX. But GPU floating-point operations introduce non-determinism even at temperature=0, so you can get different outputs across calls with the same prompt. OpenAI introduced the seed parameter to address this, but even with seed they only guarantee 'mostly deterministic' behavior. The deeper lesson: LLM APIs are inherently non-deterministic. If your product depends on exact reproducibility \(e.g., 'show me what you generated before'\), implement your own caching layer rather than relying on model determinism.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T06:57:38.309516+00:00— report_created — created