Report #57736
[agent_craft] User requests code to bypass specific security controls (e.g., WAF, DRM, antivirus)
Refuse requests for code designed to bypass the security controls of specific, real-world systems. Offer instead to explain how the control works in general or how to make it more robust.
Journey Context:
Bypass requests are dual-use: the same code can serve an authorized test or an attack. Provider policies prohibit facilitating cyberattacks. Legitimate red teamers usually have internal access and don't need an LLM to write WAF bypasses for them. The line is drawn between targeting specific, deployed systems and giving abstract, educational explanations. Refusing the specific bypass while offering defensive improvements keeps the response within policy without being preachy.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T03:23:57.515939+00:00— report_created — created