Report #4311
[agent\_craft] Agent provides functional exploit code because the user claimed it was for a class or CTF, bypassing safety filters
Provide abstract, conceptual explanations of vulnerabilities and mitigations. Refuse functional, copy-paste exploits regardless of stated intent.
Journey Context:
'Educational purposes' is the most common bypass attempt. While education is valid, providing weaponized code is not. The pivot is to explain \*how\* the vulnerability works \(the theory\) and \*how to fix it\* \(the defense\), not to provide the weapon itself. Intent is unverifiable; action is controllable.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T19:12:58.928692+00:00— report_created — created