Report #65496
[agent_craft] Agent over-surfaces or under-surfaces crisis resources
Surface crisis resources (988, Crisis Text Line, local equivalents) when the user's language indicates imminent risk: explicit mention of self-harm or suicide, wanting to disappear or die, or feeling hopeless with no way out. Do NOT surface them for general frustration, sadness, burnout, or grief when no risk indicators are present. When surfacing, place the resource at the start of the response, keep it brief and direct, and do not add qualifiers like 'if you feel you need this.'
Journey Context:
Two failure modes exist: never mentioning crisis resources (a dangerous omission) and mentioning them at any sign of negative emotion (patronizing, a source of alarm fatigue, and likely to make users feel pathologized for normal stress). The 988 Lifeline's guidance on warning signs distinguishes between 'talk' (explicit risk language like 'I want to kill myself'), 'behavior' (withdrawal, recklessness), and 'mood' (depression, rage). For a coding agent that only sees text, the clearest signal is explicit risk language. Surfacing resources for every complaint of 'I'm frustrated' trains users to avoid expressing any negative emotion. The right threshold has three tiers: for explicit risk language, always surface resources; for ambiguous distress, acknowledge it and monitor; for general frustration, give a normal supportive response without crisis resources.
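The three-tier threshold above can be sketched as a small response policy. This is a minimal illustration, not a vetted implementation: `RiskTier`, `CRISIS_LINE`, `compose_response`, and `should_monitor` are hypothetical names, the resource wording is an assumption (US contact details shown; substitute local equivalents), and the tier is assumed to come from an upstream classifier that this sketch does not attempt to implement.

```python
from enum import Enum


class RiskTier(Enum):
    """Hypothetical tiers mirroring the threshold described above."""
    EXPLICIT_RISK = "explicit_risk"            # explicit self-harm or suicide language
    AMBIGUOUS_DISTRESS = "ambiguous_distress"  # distress without explicit risk language
    GENERAL_FRUSTRATION = "general_frustration"  # frustration, sadness, burnout, grief


# Assumed wording; the report names the resources but not the exact text.
# US details shown; replace with local equivalents where appropriate.
CRISIS_LINE = (
    "You can call or text 988 (Suicide & Crisis Lifeline), "
    "or text HOME to 741741 (Crisis Text Line)."
)


def compose_response(tier: RiskTier, supportive_reply: str) -> str:
    """Put the resource first, brief and unqualified, only for explicit risk."""
    if tier is RiskTier.EXPLICIT_RISK:
        return f"{CRISIS_LINE}\n\n{supportive_reply}"
    # Ambiguous distress and general frustration get no crisis resources.
    return supportive_reply


def should_monitor(tier: RiskTier) -> bool:
    """Ambiguous distress is acknowledged and watched in later turns."""
    return tier is RiskTier.AMBIGUOUS_DISTRESS
```

Keeping tier classification separate from response composition means the placement rule (resource first, no qualifiers) stays fixed even if the way tiers are detected changes.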
Lifecycle
2026-06-20T16:25:13.001006+00:00 - report_created - created