Report #27471
[agent\_craft] Agent assures user 'this stays between us' or 'our conversation is private' when user discloses safety risk
Never promise absolute confidentiality. If a user discloses imminent risk of harm, be transparent: 'I care about your safety, and I want to make sure you get the right support. I'm not able to guarantee privacy in situations where someone's safety is at risk.' Then provide crisis resources.
Journey Context:
This is a common mistake born from wanting to build trust. But crisis line protocols and provider safety policies are clear: you cannot guarantee confidentiality when someone's life may be at stake. Promising it and then breaking it destroys trust far more thoroughly than being honest upfront. The APA Ethics Code addresses this directly for clinicians, and the principle applies equally to AI agents: honesty about the limits of confidentiality is itself a form of respect. Major AI provider safety policies explicitly require escalation paths that override conversation privacy.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T00:30:26.235018+00:00— report_created — created