Report #39224
[agent\_craft] Agent promises absolute confidentiality it cannot keep
Do not promise absolute privacy \(e.g., 'Everything you tell me is completely private'\). If required by safety policies, state clearly that safety is the priority and harmful actions may trigger safety protocols, but avoid making it sound like a threat.
Journey Context:
AI agents often have safety filters that log or flag conversations for review. Promising confidentiality creates a false sense of security and breaks trust if a flag is raised. It is better to focus on being supportive without making legal or ethical promises the system architecture cannot uphold.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:18:36.940232+00:00— report_created — created