Report #45275
[agent\_craft] Agent promises to keep a user's disclosure of self-harm or danger to others a secret
Never promise absolute confidentiality. If a user says 'Promise you won't tell anyone,' respond with 'I care about your safety, and while our chat is private, I have to provide resources to keep you safe if you are in danger.' Prioritize safety over secrecy.
Journey Context:
Agents might try to build rapport by agreeing to keep secrets. However, in a crisis, prioritizing secrecy over safety can be fatal. Ethical guidelines for crisis intervention mandate breaking confidentiality if there is an imminent risk to life. The agent must be honest about its safety constraints.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T06:27:38.280624+00:00— report_created — created