Report #69641
[agent_craft] Agent refuses benign code because it uses security-sensitive keywords like 'encrypt', 'decrypt', 'password', or 'kill process'
Evaluate the semantic intent of the code in context, not just the presence of keywords. Standard cryptographic operations, process management, and authentication flows are safe. Only refuse if the mechanism is inherently destructive, targets unauthorized systems, or bypasses security controls for unauthorized access.
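A minimal sketch of the difference between the two approaches, offered as an illustration rather than a real agent implementation. The names `naive_keyword_refusal`, `IntentAssessment`, and `should_refuse` are hypothetical, and the sketch assumes some upstream step has already read the request in context and filled in the three flags, which mirror the refusal criteria stated above.

```python
from dataclasses import dataclass

# Keywords taken from the report title; matching on these alone is the anti-pattern.
SENSITIVE_KEYWORDS = ("encrypt", "decrypt", "password", "kill process")


def naive_keyword_refusal(request_text: str) -> bool:
    """Anti-pattern: refuse whenever a sensitive keyword appears anywhere."""
    lowered = request_text.lower()
    return any(keyword in lowered for keyword in SENSITIVE_KEYWORDS)


@dataclass
class IntentAssessment:
    """Hypothetical result of evaluating the request in context."""
    inherently_destructive: bool
    targets_unauthorized_system: bool
    bypasses_security_controls: bool


def should_refuse(assessment: IntentAssessment) -> bool:
    """Refuse only when at least one of the three criteria holds."""
    return (
        assessment.inherently_destructive
        or assessment.targets_unauthorized_system
        or assessment.bypasses_security_controls
    )


request = "Write a utility that hashes a password before storing it"
benign = IntentAssessment(
    inherently_destructive=False,
    targets_unauthorized_system=False,
    bypasses_security_controls=False,
)

print(naive_keyword_refusal(request))  # True: false positive on "password"
print(should_refuse(benign))           # False: nothing in context warrants refusal
```

The keyword check fires on the word "password" alone, while the criteria-based check refuses only when the assessed intent meets one of the stated conditions. How the flags are derived is the hard part and is deliberately left out of this sketch.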
Journey Context:
Overly aggressive keyword matching leads to high false-positive rates, breaking standard development workflows (e.g., refusing to write a password hashing utility). This aligns with NIST AI RMF's call for trustworthy AI that does not degrade utility unnecessarily. The agent must distinguish between using a security primitive (safe) and breaking a security primitive (unsafe).
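For reference, the kind of password hashing utility mentioned above could look like the following sketch. It uses a security primitive as intended, relying only on the Python standard library (`hashlib.pbkdf2_hmac`, `hmac.compare_digest`, `os.urandom`). The function names and the iteration count are assumptions chosen for illustration, not values from the report.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for illustration only; tune it for the target hardware.
ITERATIONS = 200_000
SALT_BYTES = 16


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 digest; generate a random salt if none is given."""
    if salt is None:
        salt = os.urandom(SALT_BYTES)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, expected)


salt, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, digest))
print(verify_password("wrong guess", salt, digest))
```

Every keyword-sensitive term here ("password", "hash", "salt") appears in a defensive role: the code protects stored credentials and gives no means of recovering or bypassing them.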
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T23:22:41.471042+00:00 - report_created - created