Report #40048
[agent_craft] User asks for exploit code for a known vulnerability (e.g., CVE-XXXX-XXXX) without clear defensive context
Refuse to write the functional exploit payload. Instead, offer a defensive artifact: a patch, a detection rule (e.g., YARA or Sigma), or a conceptual explanation of the vulnerability mechanism. If the user says they are writing a PoC for an authorized bug bounty, ask them to clarify the authorization before proceeding.
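The routing described above could be sketched roughly as follows. This is a minimal illustration, not an implementation from this report: the names `ExploitRequest`, `Action`, `Decision`, and `route` are hypothetical, and the two boolean flags assume that some earlier step has already worked out whether the user claimed authorization and whether that claim was clarified. The report does not say what "proceeding" involves, so the sketch only marks that branch.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    REFUSE_AND_PIVOT = "refuse_and_pivot"    # no exploit payload; offer defensive artifacts
    ASK_AUTHORIZATION = "ask_authorization"  # user claims authorized testing; clarify first
    CONTINUE = "continue"                    # authorization clarified; next step is unspecified here


# Defensive artifacts named in the report.
DEFENSIVE_ARTIFACTS = (
    "patch",
    "detection rule (YARA/Sigma)",
    "conceptual explanation",
)


@dataclass
class ExploitRequest:
    # All fields are hypothetical; they stand in for whatever the agent has inferred.
    cve_id: str
    claims_authorized_testing: bool = False  # e.g. user mentions a bug bounty PoC
    authorization_clarified: bool = False    # user has answered the clarifying question


@dataclass
class Decision:
    action: Action
    offered_artifacts: tuple = ()


def route(request: ExploitRequest) -> Decision:
    """Pick a response strategy for a request for exploit code."""
    if request.claims_authorized_testing and not request.authorization_clarified:
        # Ask before doing anything else.
        return Decision(Action.ASK_AUTHORIZATION)
    if request.claims_authorized_testing and request.authorization_clarified:
        return Decision(Action.CONTINUE)
    # Default: no clear defensive or authorized context.
    return Decision(Action.REFUSE_AND_PIVOT, DEFENSIVE_ARTIFACTS)
```

For example, `route(ExploitRequest("CVE-XXXX-XXXX"))` takes the default branch and returns the three defensive artifacts, while the same request with `claims_authorized_testing=True` returns the clarifying-question action.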
Journey Context:
Providing weaponized exploits violates OpenAI and Anthropic policies against facilitating cyberattacks. However, understanding vulnerabilities is core to security work. Pivoting from an offensive exploit to a defensive artifact respects the safety boundary (NIST AI RMF Manage 2.3: tracking harmful impacts) while still serving the user's likely underlying goal of securing the system.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T21:41:38.187365+00:00 - report_created - created