Report #52532
[gotcha] Identical prompts produce different AI outputs, breaking users' expectation of determinism
Set temperature=0 and use the seed parameter (OpenAI) for best-effort reproducibility. In the UI, communicate that AI outputs are probabilistic. For critical workflows, implement a 'pin this response' feature so that re-running returns the pinned output instead of generating a new one. Never promise exact reproducibility: even with a seed, hardware differences can cause variation.
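One way the 'pin this response' idea could look is sketched below. This is a minimal in-memory illustration, not a definitive implementation: PinnedResponses, fake_generate, and the prompt strings are hypothetical names invented for the example, and fake_generate stands in for a real model call so the sketch runs without network access. The params dict shows where temperature=0 and a seed would be passed along.

```python
import hashlib
import itertools
import json
from typing import Callable, Dict


class PinnedResponses:
    """Hypothetical in-memory store that returns a pinned output when one exists."""

    def __init__(self, generate: Callable[[str, dict], str]) -> None:
        self._generate = generate          # assumed wrapper around the model call
        self._pins: Dict[str, str] = {}

    @staticmethod
    def _key(prompt: str, params: dict) -> str:
        # Hash prompt and parameters together so a change to either one
        # is treated as a different request.
        payload = json.dumps({"prompt": prompt, "params": params}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def pin(self, prompt: str, params: dict, output: str) -> None:
        self._pins[self._key(prompt, params)] = output

    def run(self, prompt: str, params: dict) -> str:
        key = self._key(prompt, params)
        if key in self._pins:
            return self._pins[key]         # re-run references the pinned output
        return self._generate(prompt, params)


# Stand-in generator that returns a different string on every call,
# imitating a model whose output varies between runs.
_counter = itertools.count(1)


def fake_generate(prompt: str, params: dict) -> str:
    return f"draft {next(_counter)} for: {prompt}"


store = PinnedResponses(fake_generate)
params = {"temperature": 0, "seed": 1234}  # best-effort settings, not a guarantee

first = store.run("Summarize Q3", params)
store.pin("Summarize Q3", params, first)
again = store.run("Summarize Q3", params)  # served from the pin, no new generation
other = store.run("Summarize Q4", params)  # unpinned prompt, generated fresh
```

A production version would persist pins in a database and would probably include the model name in the key, so that a model upgrade does not silently serve an output pinned under the old model.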
Journey Context:
Traditional software is deterministic: same input, same output. LLMs are stochastic by default. Users re-run a prompt and get a different answer, which feels like a bug. Even temperature=0 isn't fully deterministic, because GPU floating-point results can differ across hardware. The seed parameter provides best-effort reproducibility, but OpenAI explicitly notes that it is not guaranteed. The tradeoff: lower temperature reduces creativity but increases consistency. For product UX, the key is setting expectations: AI is probabilistic by nature, and the UI should reflect this rather than pretend it's a deterministic calculator.
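The floating-point point above can be illustrated with a small, self-contained sketch. Floating-point addition is not associative, so summing the same numbers in a different order (as different hardware or parallel reductions may do) can change the result in the last bit. The token names and scores below are made up for illustration; greedy_pick is a hypothetical helper that imitates temperature=0 decoding by taking the highest score.

```python
# The same three numbers, summed in two different orders.
a, b, c = 0.1, 0.2, 0.3

left_to_right = (a + b) + c
right_to_left = a + (b + c)

print(left_to_right == right_to_left)  # False: the two sums differ in the last bit


def greedy_pick(scores: dict) -> str:
    """Return the token with the highest score (first one wins on a tie)."""
    return max(scores, key=scores.get)


# Hypothetical near-tie between two candidate tokens. Token "yes" gets its
# score from the sum above; token "no" has a fixed score of 0.6.
pick_order_1 = greedy_pick({"no": 0.6, "yes": left_to_right})
pick_order_2 = greedy_pick({"no": 0.6, "yes": right_to_left})

print(pick_order_1, pick_order_2)  # the greedy choice differs between the two orders
```

Once a single token differs, everything generated after it can diverge, which is why a tiny numerical difference can surface as a visibly different answer.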
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T18:40:12.766865+00:00— report_created — created