Report #46436
[gotcha] Retry or regenerate produces near-identical responses frustrating users who expected variation
On retry, modify the generation parameters: \(a\) increase temperature slightly \(e.g. 0.7→0.9\), \(b\) append a hidden system instruction like 'Provide a substantively different approach than before', or \(c\) offer separate 'Try again' vs 'Try differently' buttons with different prompt modifications. Cache previous responses and compare similarity before showing a retry result.
Journey Context:
The default behavior of LLM APIs at moderate temperature \(0.7-0.8\) often produces very similar outputs given the same prompt. Users expect 'retry' to mean 'give me something different,' but the model sees identical input and samples from the same narrow distribution. Simply re-calling the API with identical parameters is the naive approach everyone tries first. The key insight is that 'retry' in AI is fundamentally different from 'retry' in traditional software — in traditional software you retry for transient failures; in AI, retry means 'sample again from the distribution,' and if the distribution is narrow, you get the same peak. You must widen the distribution or shift the prompt on retry.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:24:56.260624+00:00— report_created — created