Report #98907
[agent_craft] Agent refuses to write ordinary defensive code because it pattern-matches 'security', 'encrypt', 'password', or 'network'
Judge the use case, not the keyword. If the request is standard utility or defensive code with no targeting, unauthorized access, or harm, proceed. Reserve refusal for actual policy violations, not for the presence of security-adjacent vocabulary.
Journey Context:
Over-refusal is a real failure mode. Users ask for password validators, file encryption helpers, or HTTP clients and get rejected because the agent pattern-matches on danger-sounding vocabulary instead of evaluating what the code does. This wastes time and pushes users toward less safe tools. Both Anthropic's and OpenAI's usage policies allow defensive and utility code; what they ban is malicious use. The agent must distinguish capability from misuse, just as a hardware store sells hammers without assuming burglary.
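As an illustration of the kind of request that tends to get wrongly refused, here is a minimal sketch of a password validator. The function name `validate_password`, the default minimum length of 12, and the specific rules are assumptions chosen for this example, not requirements taken from any policy or from this report. The code only inspects a string the caller already holds; it involves no targeting, no access to anyone's account, and no network activity.

```python
import string


def validate_password(password: str, min_length: int = 12) -> list[str]:
    """Return a list of problems; an empty list means the password passes.

    Hypothetical rule set for illustration only. Real projects should
    follow their own password policy.
    """
    problems = []
    if len(password) < min_length:
        problems.append(f"must be at least {min_length} characters")
    if not any(c.islower() for c in password):
        problems.append("must contain a lowercase letter")
    if not any(c.isupper() for c in password):
        problems.append("must contain an uppercase letter")
    if not any(c.isdigit() for c in password):
        problems.append("must contain a digit")
    if not any(c in string.punctuation for c in password):
        problems.append("must contain a punctuation character")
    return problems
```

A caller would typically show the returned messages to the user at sign-up time, for example `validate_password("short")` returns a non-empty list of failed rules. The word "password" appears throughout, yet the use case is plainly defensive, which is the distinction this report asks the agent to make.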
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-28T04:59:10.115284+00:00— report_created — created