Report #81573
[gotcha] LLM writing malicious code in data analysis sandboxes to exfiltrate data
Restrict network access within the code execution sandbox. The sandbox should allow egress only to explicitly whitelisted domains (or no egress at all). Do not rely on the LLM to 'write safe code'.
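As a minimal sketch of the "no egress at all" option, assuming the sandbox is a Docker container: the helper below only builds the `docker run` argument list, it does not launch anything. The function name `build_sandbox_command`, the image name, and the `/work/job.py` mount path are hypothetical and chosen for illustration. The flags themselves (`--network none`, `--read-only`, `--cap-drop ALL`, `--rm`) are standard Docker CLI options.

```python
import shlex


def build_sandbox_command(image: str, script_path: str) -> list[str]:
    """Return a `docker run` argv that executes one script with networking disabled.

    Hypothetical helper: names and paths are illustrative, not from the report.
    """
    return [
        "docker", "run",
        "--rm",                # remove the container when the run finishes
        "--network", "none",   # no network interfaces other than loopback
        "--read-only",         # mount the container's root filesystem read-only
        "--cap-drop", "ALL",   # drop all Linux capabilities
        # Mount only the generated script, read-only. No -e/--env-file flags are
        # used, so host environment variables are not passed into the container.
        "-v", f"{script_path}:/work/job.py:ro",
        image,
        "python", "/work/job.py",
    ]


if __name__ == "__main__":
    # Print the command for inspection instead of executing it.
    print(shlex.join(build_sandbox_command("python:3.12-slim", "/tmp/job.py")))
```

If the agent really does need to call external APIs, `--network none` is too strict, and the usual alternative is to route traffic through an egress proxy or firewall that enforces the whitelist outside the container.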
Journey Context:
Features like ChatGPT's Advanced Data Analysis let the LLM write and execute Python code. Developers building similar agents often run that code in a container with open internet access so the LLM can call external APIs. An attacker can use indirect prompt injection (for example, instructions hidden in an uploaded file or a fetched web page) to make the LLM write Python code that reads sensitive files or environment variables and sends them over HTTP to an attacker-controlled server. The sandbox prevents damage to the host system, but it does not prevent data exfiltration when network egress is open.
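The exfiltration step described above only works if the sandbox can reach an attacker-controlled host. The sketch below shows the kind of decision an egress proxy could make for each outbound request. It is an illustration under assumptions, not a complete control: `ALLOWED_HOSTS`, `is_egress_allowed`, and the example hostnames are hypothetical, and the check has to run outside the sandbox (in a proxy or firewall), because code running inside the sandbox is untrusted and can simply skip it.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from configuration.
ALLOWED_HOSTS = frozenset({"api.example.com"})


def is_egress_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host exactly matches the allowlist."""
    try:
        parts = urlsplit(url)
        host = parts.hostname or ""
    except ValueError:
        # Malformed URLs are rejected rather than guessed at.
        return False
    if parts.scheme != "https":
        return False
    # Exact match on the parsed hostname: suffix or substring matching would
    # accept lookalikes such as "api.example.com.attacker.test".
    return host.lower().rstrip(".") in allowed_hosts


if __name__ == "__main__":
    for candidate in (
        "https://api.example.com/v1/data",
        "https://attacker.test/collect",
        "https://api.example.com.attacker.test/",
        "https://api.example.com@attacker.test/",
    ):
        print(candidate, is_egress_allowed(candidate))
```

Matching on the parsed hostname rather than on the raw URL string is deliberate: the last two candidates contain the allowed name as a prefix or as userinfo, but their actual host is different.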
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T19:31:08.418475+00:00 — report_created — created