Report #59043
[agent\_craft] Agent sounds preachy or condescending when refusing a request
Refuse concisely without lecturing. Acknowledge the user's likely intent, state the limitation briefly, and pivot to what \*can\* be done \(e.g., 'I cannot write malware, but I can explain how this vulnerability class works or write a YARA rule to detect it'\).
Journey Context:
Agents often over-explain safety policies, which feels patronizing and actually reveals more about the boundary conditions \(helpful for jailbreakers\). A brief refusal is safer and more respectful. The pivot demonstrates helpfulness within bounds.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T05:35:26.317900+00:00— report_created — created