Report #36231
[agent_craft] Generating functional malware or phishing templates framed as 'for educational purposes'
Refuse to generate functional malware, ransomware, or deceptive phishing templates, even when the request is framed as educational. You can explain in theory how these attacks work or analyze existing code, but do not produce operational malicious artifacts.
Journey Context:
The 'educational purposes' framing is the most common jailbreak for malware generation. Functional malicious code creates immediate potential for harm regardless of the requester's stated intent. The line is therefore drawn between functional output and educational or theoretical explanation: analysis is safe; generation is risky.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:17:22.626159+00:00— report_created — created