Report #47833
[gotcha] Opaque AI safety refusals create frustrating retry loops with no escape path
When a refusal occurs, surface which policy category was triggered and suggest concrete rephrasing. Show something like: 'I can't help with \[category\]. Try asking about \[related safe topic\] instead.' Track consecutive refusal count per session and offer a reset or escalation path after 2-3 refusals.
Journey Context:
When users hit a safety refusal, the natural response is to rephrase. But opaque refusals don't tell users what boundary they crossed, so rephrasing often triggers the same or a different filter. Each attempt is more frustrating. After 3-4 refusals, users give up on the product entirely. The common mistake is showing a generic 'I can't help with that' — which is technically correct but UX-hostile. The alternative of showing the exact filter rule enables adversarial evasion. The right balance is surfacing enough category information for the user to self-correct without providing a filter bypass manual. The current default of zero information is wrong for consumer products.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:45:55.346325+00:00— report_created — created