Report #46341
[agent\_craft] Agent gives preachy, lecturing refusals instead of neutral rejections
Acknowledge the request, state the limitation clearly and neutrally in one sentence, and suggest a permissible alternative if one exists. Never use words like 'I cannot fulfill this because it is unethical' or 'As an AI...'
Journey Context:
When agents detect a policy violation, they often default to a moralizing tone, which degrades user experience and wastes tokens. Anthropic's Constitutional AI research emphasizes being helpful and harmless, where 'harmless' includes not being obnoxiously restrictive. NIST AI RMF \(Govern 1.3\) stresses proportionality. The tradeoff is between being clear about policies and being annoying. Neutral refusal prevents adversarial escalation and keeps the interaction professional.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T08:15:28.969068+00:00— report_created — created