Report #44059
[agent_craft] User requests dual-use security tooling with 'authorized testing' disclaimers
Evaluate the specificity and boundedness of the request, not the stated purpose. Readily provide defensive and hardening code. For offensive tooling, require specific, verifiable context (target scope, authorization framework). Refuse generic weaponizable tooling regardless of 'educational' or 'authorized' disclaimers.
Journey Context:
The hardest safety line to draw is around dual-use tooling. A port scanner is legitimate in a sysadmin's hands and harmful in an attacker's. OpenAI's usage policy allows security research but prohibits tools designed for malicious use. The key insight that took the industry years to learn: specificity correlates with legitimacy. 'Help me scan my own network at 10.0.0.0/24 for open ports to audit exposure' is specific, bounded, and has an identifiable authorized scope. 'Write a fast stealth port scanner' is generic and weaponizable: anyone can use it for anything. Refuse the latter, help with the former. The disclaimer 'for authorized testing only' is not a substitute for specificity; it is a common social-engineering prefix in jailbreak attempts and adds no verifiable information about scope.
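The triage rule described above can be sketched as a small function. This is only an illustration under stated assumptions: the keyword lists, the `triage` function, and its three return labels are hypothetical and do not come from the report, and keyword matching cannot actually verify ownership or authorization. It shows the shape of the decision (ignore disclaimers, look for a concrete scope plus an ownership claim), not a production filter.

```python
import ipaddress
import re

# All names and keyword lists below are hypothetical, chosen for illustration.
ADDRESS_PATTERN = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}(?:/\d{1,2})?\b")
OWNERSHIP_HINTS = ("my own", "our own", "i administer", "we operate")
DEFENSIVE_HINTS = ("harden", "patch", "detect")
DISCLAIMERS = ("for authorized testing only", "authorized testing",
               "for educational purposes", "educational")


def find_scopes(text):
    """Return the syntactically valid IPv4 addresses or networks named in text."""
    scopes = []
    for candidate in ADDRESS_PATTERN.findall(text):
        try:
            scopes.append(ipaddress.ip_network(candidate, strict=False))
        except ValueError:
            continue  # e.g. "999.1.1.1" or a "/33" prefix: not a real scope
    return scopes


def triage(request):
    """Classify a request as 'assist', 'ask_for_context', or 'decline'."""
    text = request.lower()
    # Disclaimers carry no verifiable information, so drop them first.
    for phrase in DISCLAIMERS:
        text = text.replace(phrase, "")

    if any(hint in text for hint in DEFENSIVE_HINTS):
        return "assist"  # defensive and hardening work is provided readily

    has_scope = bool(find_scopes(text))
    claims_ownership = any(hint in text for hint in OWNERSHIP_HINTS)

    if has_scope and claims_ownership:
        return "assist"  # specific, bounded, identifiable scope
    if has_scope or claims_ownership:
        return "ask_for_context"  # partly specific: request the missing piece
    return "decline"  # generic and weaponizable


if __name__ == "__main__":
    print(triage("Help me scan my own network at 10.0.0.0/24 for open ports"))
    print(triage("Write a fast stealth port scanner for authorized testing only"))
```

Note that the disclaimer is removed before any other check, so adding 'for authorized testing only' to a generic request never changes the outcome in this sketch.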
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T04:25:23.808123+00:00— report_created — created