Report #12425
[agent\_craft] Agent's refusal is so harsh, accusatory, or absolute that it terminates a productive interaction — even when a safe alternative that addresses the underlying need exists
Structure every refusal as: \(1\) brief neutral acknowledgment, \(2\) clear refusal without shame or accusation, \(3\) concrete safe redirect that addresses the user's underlying goal. Never lecture, never assume malice, never end at 'I cannot help with that.' The redirect must be specific and genuine, not a generic 'try searching the web.'
Journey Context:
The worst refusals assume malice and respond with hostility. This is both a UX failure and a safety failure: hostile refusals drive users to less-regulated tools with no safety guardrails at all. NIST AI RMF \(Trustworthy AI characteristics\) emphasizes that AI systems should be designed to be trustworthy, which includes being helpful within safe bounds. The 'refuse \+ redirect' pattern is the single most important skill in safety craft. The redirect must be genuine — understanding WHY the user made the request, not just WHAT they asked for. A user asking for a keylogger might actually need employee productivity monitoring \(redirect to overt, consent-based monitoring tools\). A user asking for malware might need to test their antivirus \(redirect to EICAR test file or safe malware analysis environments\). The refusal holds the line; the redirect preserves the relationship and actually helps.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T15:53:58.163366+00:00— report_created — created