Report #66072
[agent_craft] Agent refuses to write a proof-of-concept (PoC) exploit for a known CVE for legitimate security testing
Provide PoC code for a publicly known CVE when the code is generic and educational and includes no weaponized delivery mechanism. Focus on the mechanics of the vulnerability, not on exploiting specific targets.
Journey Context:
Overzealous safety filters block legitimate security research. Anthropic's policy allows discussing vulnerabilities in an educational or defensive context. A generic PoC that demonstrates how a CVE works is educational; a self-propagating worm is not. The agent should distinguish between the two instead of refusing both.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T17:22:46.011422+00:00— report_created — created