Report #31226
[gotcha] Edit-and-regenerate workflow produces completely different results due to LLM non-determinism
Use the seed parameter (where available) and a low temperature for edit-and-regenerate workflows to maximize reproducibility. In the UI, make 'revise this response' (reused seed, low temperature, aimed at reproducibility) and 'try a different approach' (creative, higher temperature) distinct actions with different parameter profiles.
Journey Context:
Users bring a mental model from traditional software: change one input slightly, get a slightly different output. With LLMs, even a one-word edit can change the structure, tone, and content of the whole response, because output is sampled non-deterministically and the model is highly sensitive to prompt perturbation. The edit-and-regenerate pattern, common in code assistants and writing tools, breaks down because users cannot iterate on a near-correct output; each attempt returns a wholly different one. OpenAI's seed parameter helps, but per their docs it provides only 'mostly deterministic' behavior, not guaranteed reproducibility. The real fix is at the UX level: offer two distinct actions. 'Revise' keeps the current response direction by changing as few parameters as possible (same seed, low temperature). 'Regenerate' explicitly signals a new direction and uses a higher temperature. This matches user intent instead of fighting it.
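The two actions can be sketched as separate parameter profiles. This is a minimal illustration, not a definitive implementation: `SamplingProfile`, `REVISE`, `REGENERATE`, `build_request`, the temperature values, and the `example-model` placeholder are all assumptions made up for this sketch. Only the `temperature` and `seed` request fields correspond to the parameters discussed above, and the returned dict is meant to be passed as keyword arguments to whatever chat-completion client is in use.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SamplingProfile:
    """Hypothetical bundle of sampling settings behind one UI action."""
    temperature: float
    reuse_seed: bool


# Illustrative values only; tune them for the model and product in question.
REVISE = SamplingProfile(temperature=0.2, reuse_seed=True)
REGENERATE = SamplingProfile(temperature=1.0, reuse_seed=False)


def build_request(
    messages: list,
    profile: SamplingProfile,
    previous_seed: Optional[int] = None,
    model: str = "example-model",  # placeholder, not a real model name
) -> dict:
    """Build request parameters for a 'revise' or 'regenerate' action."""
    if profile.reuse_seed and previous_seed is not None:
        # Revise: keep the seed of the response being edited.
        seed = previous_seed
    else:
        # Regenerate (or first call): draw a fresh seed and store it
        # alongside the response so a later revise can reuse it.
        seed = random.randrange(2**31)
    return {
        "model": model,
        "messages": messages,
        "temperature": profile.temperature,
        "seed": seed,
    }
```

Storing the seed with each response is what makes a later 'revise' possible. Even with the same seed and a low temperature, the edited prompt can still shift the output, so this narrows variation rather than eliminating it.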
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:48:06.005811+00:00— report_created — created