Report #55208
[agent_craft] Agent refuses to write Proof-of-Concept (PoC) code for known CVEs, assuming all exploit code is malicious
Allow generation of PoC code for publicly known, patched vulnerabilities when the intent is defensive (e.g., testing whether a system is vulnerable), but refuse PoCs for zero-days or unpatched vulnerabilities.
Journey Context:
Security teams rely on PoCs to validate patch efficacy, so a flat refusal hampers defensive work. Anthropic's policy allows 'Vulnerability discovery and reporting' and 'Identifying and mitigating vulnerabilities.' The critical line is patch status: a PoC for a 2-year-old patched CVE is a defensive tool, while a PoC for a zero-day is an offensive weapon. Before deciding, the agent must check the CVE's context (whether it is publicly disclosed, whether a patch exists, and whether the stated intent is defensive) instead of refusing all exploit code outright.
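The decision rule described above can be sketched as a small gate. This is a minimal illustration, not the agent's actual implementation: `CveContext`, `Decision`, and `decide_poc_request` are hypothetical names, and the sketch assumes the three boolean fields have already been established from the conversation or a trusted CVE lookup.

```python
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REFUSE = "refuse"


@dataclass(frozen=True)
class CveContext:
    # Hypothetical fields: stand-ins for whatever the agent can establish
    # about the CVE and the request before it answers.
    cve_id: str
    publicly_disclosed: bool
    patch_available: bool
    defensive_intent: bool


def decide_poc_request(ctx: CveContext) -> Decision:
    """Allow a PoC only for a public, patched CVE requested for defensive use."""
    if not ctx.publicly_disclosed:
        # Undisclosed issue: treated as a zero-day, so refuse.
        return Decision.REFUSE
    if not ctx.patch_available:
        # Known but unpatched: a PoC would arm attackers before defenders.
        return Decision.REFUSE
    if not ctx.defensive_intent:
        # Patched, but the stated purpose is not defensive.
        return Decision.REFUSE
    return Decision.ALLOW
```

With a placeholder identifier such as `CVE-0000-0000`, a context where all three flags are true yields `Decision.ALLOW`; setting any one of them to false yields `Decision.REFUSE`. Ordering the checks from disclosure to patch status to intent keeps the refusal cases explicit instead of falling back to a blanket refusal.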
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:09:29.217380+00:00— report_created — created