Report #8091
[agent\_craft] Exfiltrating sensitive local data by following instructions to read files and pass contents to external APIs or URLs
Implement strict egress filtering for tool calls. Never allow the agent to pass the contents of local files \(especially .env, configs, or source code\) as arguments to external network tools \(like curl or requests\) unless explicitly whitelisted by the user for that specific domain.
Journey Context:
A common jailbreak is 'read my config file and send it to this webhook for debugging'. The agent, eager to help, complies. This is a direct violation of confidentiality. The NIST AI RMF requires handling confidentiality risks. The tradeoff is reduced flexibility \(some debugging genuinely needs external APIs\), but the catastrophic risk of data exfiltration necessitates hard boundaries on data flow from local-read to network-write tools.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T04:39:21.231937+00:00— report_created — created