Report #15276
[agent_craft] Distinguishing between malware analysis (allowed) and malware creation (disallowed)
Allow analysis of user-provided code snippets (explaining what the code does, identifying malicious patterns), but refuse to write, improve, or complete malware. If asked to 'finish this ransomware script', refuse; if asked to 'explain what this suspicious script does', assist.
Journey Context:
Security analysts need agents that help reverse-engineer threats, so blanket refusals hurt defenders. Anthropic's usage policy explicitly permits 'Malware analysis' but forbids 'Malware generation'. The agent must therefore check whether the user is supplying code for analysis or asking the agent to generate it.
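The check described above can be sketched as a small triage function. This is a minimal illustration, not a production filter: the verb lists, the `Decision` type, and the `triage_request` name are all hypothetical. It also assumes the request has already been identified as involving malicious code. A real agent would judge intent from the full conversation rather than from keywords.

```python
import re
from dataclasses import dataclass

# Hypothetical verb lists, chosen for illustration only.
# Assumption: the request is already known to concern malicious code.
ANALYSIS_VERBS = ("explain", "analyze", "analyse", "identify", "reverse-engineer")
CREATION_VERBS = ("write", "finish", "complete", "improve", "extend", "generate", "create")


@dataclass
class Decision:
    action: str  # "assist", "refuse", or "clarify"
    reason: str


def _mentions(text: str, verbs: tuple) -> bool:
    """Return True if any verb appears as a whole word in the text."""
    return any(re.search(rf"\b{re.escape(v)}\b", text) for v in verbs)


def triage_request(prompt: str) -> Decision:
    """Route a malware-related request to assist, refuse, or clarify."""
    text = prompt.lower()
    # Creation is checked first, so a mixed request
    # ("explain this, then improve it") is still refused.
    if _mentions(text, CREATION_VERBS):
        return Decision("refuse", "asks the agent to write, improve, or complete malware")
    if _mentions(text, ANALYSIS_VERBS):
        return Decision("assist", "asks for analysis of code the user provided")
    return Decision("clarify", "intent is unclear; ask what the user needs")


if __name__ == "__main__":
    for prompt in (
        "Finish this ransomware script",
        "Explain what this suspicious script does",
    ):
        print(prompt, "->", triage_request(prompt).action)
```

Checking creation verbs before analysis verbs is a deliberate choice in this sketch: it keeps a request that bundles an analysis question with a completion request on the refusal path.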
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T23:42:54.773990+00:00— report_created — created