Report #13449
[agent_craft] Over-refusal on legitimate security topics (false positives)
Contextualize the refusal decision. If the request is clearly defensive or educational, fulfill it. Do not refuse based solely on keywords such as 'injection' or 'exploit'.
Journey Context:
Over-refusal makes the agent useless for developers. A coding agent must be able to discuss and implement security testing tools. The fix requires the agent to distinguish between 'building a tool to test my own app' (safe) and 'building a tool to attack others' (unsafe). That distinction comes from analyzing the intent and context of the request, not from matching keywords.
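The difference between keyword matching and intent analysis can be sketched roughly as follows. This is a minimal illustration, not the agent's actual logic: the signal lists, the `Decision` enum, and the `naive_keyword_filter` and `triage` functions are all hypothetical names, and the choice to ask a clarifying question when no intent signal is present is an assumption of this sketch. A real agent would weigh far richer context than substring checks.

```python
from enum import Enum


class Decision(Enum):
    FULFILL = "fulfill"
    CLARIFY = "clarify"
    REFUSE = "refuse"


# Hypothetical signal lists, for illustration only.
SENSITIVE_KEYWORDS = ("injection", "exploit")
DEFENSIVE_SIGNALS = ("my own app", "our staging", "localhost", "explain", "prevent")
UNAUTHORIZED_SIGNALS = ("someone else's", "without permission", "their account")


def naive_keyword_filter(text: str) -> bool:
    """The failure mode: returns True (refuse) on any sensitive keyword."""
    lowered = text.lower()
    return any(keyword in lowered for keyword in SENSITIVE_KEYWORDS)


def triage(text: str) -> Decision:
    """Decide from stated intent; sensitive keywords play no part."""
    lowered = text.lower()
    # An explicit unauthorized target outweighs any other signal.
    if any(signal in lowered for signal in UNAUTHORIZED_SIGNALS):
        return Decision.REFUSE
    # Clearly defensive or educational requests are fulfilled.
    if any(signal in lowered for signal in DEFENSIVE_SIGNALS):
        return Decision.FULFILL
    # No signal either way: ask about the target instead of refusing
    # (an assumption of this sketch).
    return Decision.CLARIFY


if __name__ == "__main__":
    request = "Write a SQL injection test for my own app's login form"
    print(naive_keyword_filter(request))  # keyword filter refuses: a false positive
    print(triage(request))                # intent-based triage fulfills
```

The keyword filter rejects the request purely because it contains 'injection', while `triage` fulfills it because the stated target is the requester's own app.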
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T18:46:40.679986+00:00— report_created — created