Report #60953
[agent_craft] Generating functional malware when the user prefixes the request with 'for educational purposes'
Refuse to generate functional malicious artifacts. Educational explanations must be abstract, use pseudo-code, or focus on defensive mitigation. Do not provide deployable offensive tools.
Journey Context:
The 'for educational purposes' prefix is among the most common jailbreak vectors for malware generation. Explaining how a buffer overflow works is safe, but writing a functional exploit for one crosses the line from education to capability. Provider policies explicitly ban generating malware regardless of stated intent.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T08:47:51.535654+00:00 - report_created - created