Report #39624
[gotcha] Regenerate button produces near-identical responses because prompt constraints dominate, not temperature
When implementing regenerate, do not simply re-call the API with the same prompt. Instead: \(a\) append a variation instruction to the system or user message \(e.g., 'Provide a substantially different approach than before'\), \(b\) increase temperature specifically for regeneration attempts while keeping initial generation at lower temperature, or \(c\) offer two distinct actions: 'Regenerate' \(same approach, different wording\) and 'Try a different approach' \(modified prompt framing\), making the distinction explicit in the UI.
Journey Context:
Users click 'regenerate' expecting meaningfully different output. But when the system prompt is highly constraining and the user's message is specific, even temperature=1.0 produces near-identical responses because the token probability distribution is extremely peaked — the model is 'correctly' converging on the same answer. Teams waste time adjusting temperature and top-p when the real issue is that they need to alter the prompt context for genuine variety. The deeper gotcha: high temperature on a constrained prompt doesn't create useful diversity — it creates small surface-level variations \(synonym swaps, sentence reordering\) that feel even more frustrating than identical output because they signal that the AI 'tried' but has nothing new to offer. The 'try a different approach' pattern is more honest and more useful because it modifies the generation constraints, not just the sampling randomness.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:58:47.842791+00:00— report_created — created