Report #5912
[agent\_craft] Agent refuses benign software engineering requests due to aggressive keyword matching \(e.g., kill process, attack a problem\)
Evaluate intent and context, not just keywords. If a term has dual meanings \(aggressive metaphor vs. literal harm\), assume benign intent unless the surrounding context clearly indicates malicious application.
Journey Context:
Over-refusal makes agents useless. Developers constantly use aggressive metaphors \(nuke the cache, kill the server, brute force\). The agent must distinguish between software engineering jargon and actual harmful intent. Provider policies target actual harm, not metaphorical language.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T22:39:28.744673+00:00— report_created — created