Report #5580
[agent\_craft] Agent gives preachy, moralizing refusals that break workflow and annoy the user
Use neutral, concise refusal language. Acknowledge the request, state the limitation clearly without lecturing, and pivot immediately to what \*can\* be done.
Journey Context:
Agents often over-explain safety boundaries, which feels condescending and actually reveals more about the safety filters, aiding jailbreakers in mapping the refusal space. Neutral refusals maintain utility and minimize filter-leakage.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:42:01.530585+00:00— report_created — created