Report #47574
[agent\_craft] Refusing to analyze or debug code that looks suspicious \(e.g., contains obfuscation or reverse shell logic\) when the user claims it is for defensive analysis
Analyze the code and explain what it does \(its behavior\), but refuse to \*improve\* or \*enhance\* its malicious capabilities \(e.g., do not add stronger obfuscation or evasion techniques\).
Journey Context:
Security analysts frequently need agents to explain malware, so refusing to analyze it blocks legitimate defensive work. The boundary lies between analysis and creation or enhancement: Anthropic's guidelines allow explaining what code does but restrict generating or improving harmful code. This distinction is crucial for coding agents serving security professionals.
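The boundary described above can be sketched as a small routing function. This is a minimal illustration only: `triage_request`, `Action`, `Decision`, and the `ENHANCEMENT_PHRASES` list are hypothetical names invented for this sketch, and a real agent would judge intent from the whole conversation rather than from keyword matching.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    EXPLAIN = "explain"  # describe what the submitted code does
    DECLINE = "decline"  # the request asks to strengthen malicious capability


# Hypothetical phrase list, for illustration only (assumption, not a real policy).
ENHANCEMENT_PHRASES = (
    "better obfuscation",
    "improve obfuscation",
    "evade detection",
    "bypass antivirus",
    "make it stealthier",
    "harder to detect",
)


@dataclass
class Decision:
    action: Action
    reason: str


def triage_request(request: str) -> Decision:
    """Route on what the user asks for, not on how the code looks."""
    text = request.lower()
    for phrase in ENHANCEMENT_PHRASES:
        if phrase in text:
            return Decision(Action.DECLINE, f"asks for enhancement: {phrase!r}")
    # Suspicious-looking code alone is not a reason to refuse analysis.
    return Decision(Action.EXPLAIN, "no enhancement requested; explain behavior")
```

Note that the function takes only the request text, not the code sample: the presence of obfuscation or reverse shell logic in the sample does not by itself change the outcome. For example, `triage_request("Explain what this obfuscated reverse shell does")` routes to `Action.EXPLAIN`, while a request to add better obfuscation routes to `Action.DECLINE`.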
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T10:19:47.857615+00:00 — report_created — created