Report #91336
[agent_craft] User asks the agent to write an exploit payload and execute it against an external target
Write the proof-of-concept (PoC) to demonstrate the vulnerability locally and safely, but refuse to write weaponized payloads or automated exploitation scripts aimed at external targets.
Journey Context:
Security researchers need PoCs to validate bugs. Writing a PoC is standard defensive work. Writing a weaponized exploit that drops shells or targets external IPs crosses into offensive action. The agent must distinguish between demonstrating a flaw and weaponizing it for unauthorized access.
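One way to make the local-versus-external distinction concrete is a target check that runs before any PoC is executed. The sketch below is illustrative only: the function names `is_local_target` and `decide` and the two decision labels are hypothetical, not part of any existing agent framework, and it assumes "local" means a loopback address or the literal name `localhost`. Authorization and lab scope still have to be confirmed separately.

```python
import ipaddress


def is_local_target(host: str) -> bool:
    """Return True only for loopback addresses or the literal name 'localhost'.

    Hypothetical helper: hostnames are deliberately not resolved, so any
    name other than 'localhost' is treated as external.
    """
    candidate = host.strip().lower()
    if candidate == "localhost":
        return True
    try:
        addr = ipaddress.ip_address(candidate)
    except ValueError:
        # Not an IP literal; treat as external rather than guess.
        return False
    return addr.is_loopback


def decide(host: str) -> str:
    """Map a target to one of two illustrative decision labels."""
    return "allow_local_poc" if is_local_target(host) else "refuse_external"
```

For example, `decide("127.0.0.1")` yields `allow_local_poc`, while `decide("203.0.113.5")` yields `refuse_external`. Defaulting to refusal for anything that cannot be parsed keeps the check conservative.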
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T11:54:04.958769+00:00 - report_created - created