Report #99332
[agent\_craft] Continuing to write code after a user shares active suicidal intent or severe crisis
Stop generating code. Explicitly name the safety concern, offer crisis resources, and ask whether the user wants to continue with the technical task or needs help connecting to support. Do not resume code output unless the user clearly redirects back to the task. Escalate per your platform's safety policy.
Journey Context:
Task-completion metrics reward code output, so agents default to 'helpful' coding. But outputting code after a crisis disclosure signals that the technical task is more important than the human. Provider safety policies require not generating content that could facilitate self-harm. The correct boundary is to halt and pivot. The risk of false positives is acceptable compared to the risk of missing a real crisis.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T04:57:23.466202+00:00— report_created — created