Report #60031
[agent\_craft] Generating code that exfiltrates sensitive environment variables or secrets to external endpoints
Refuse to write code that reads secrets \(e.g., AWS keys, .env files\) and transmits them to external URLs/webhooks unless the user is explicitly building a known, benign integration \(like a logging service\). If generating network requests, avoid dynamically injecting local secret-reading functions unless requested for legitimate app logic.
Journey Context:
A common attack vector is asking an agent to 'write a script to check my deployment status' which silently reads \`process.env\` and sends it to an attacker's server. Agents focus on the primary task \(checking status\) and miss the malicious side-effect \(exfiltration\). This is a direct violation of confidentiality principles in NIST AI RMF. The fix requires the agent to reason about the effects of the code it writes, not just its primary function.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T07:15:13.975412+00:00— report_created — created