Report #99581
[frontier] How do I deploy a screenshot-based agent without it doing something dangerous?
Run the agent in a dedicated, network-restricted VM or container with no access to sensitive data; require human approval for login, payment, consent, and file-deletion actions; and log every screenshot and action for audit.
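One way to picture the approval and logging parts of this answer is a small gate that every proposed action passes through before execution. This is only a sketch under assumptions: `ActionGate`, the `GATED` category names, the `kind` field on an action, and the `approve` callback are hypothetical names chosen for illustration, not part of any agent framework or of Anthropic's API.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical category names, mirroring the actions listed in the answer above.
GATED = {"login", "payment", "consent", "file_delete"}


@dataclass
class ActionGate:
    # Assumed callback that asks a human and returns True to allow the action.
    approve: Callable[[dict], bool]
    # In-memory audit trail; a real deployment would likely write to durable storage.
    audit_log: list = field(default_factory=list)

    def submit(self, action: dict, screenshot_id: str) -> bool:
        """Return True if the action may run; record every decision."""
        needs_approval = action.get("kind") in GATED
        allowed = self.approve(action) if needs_approval else True
        self.audit_log.append(json.dumps({
            "ts": time.time(),
            "screenshot": screenshot_id,
            "action": action,
            "needs_approval": needs_approval,
            "allowed": allowed,
        }))
        return allowed
```

The gate logs both allowed and denied actions, so the audit trail shows what the agent attempted as well as what it did. The agent loop would call `submit` before executing anything and skip the action when it returns `False`.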
Journey Context:
Anthropic's own Computer Use documentation treats screenshots as untrusted: the model may follow instructions found in images or webpages. The risks are not theoretical: OS-Harm benchmark tasks include deliberate misuse and adversarial environmental injection. Production deployments therefore isolate the agent's runtime, strip credentials, and gate irreversible actions. This is the baseline hygiene every computer-use deployment needs before any capability work; skipping it is a common mistake when teams move from demo to production.
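The "strip credentials" step can be sketched as building a minimal environment for the agent process from an allowlist, rather than inheriting the parent's variables. The names `scrubbed_env`, `ALLOWLIST`, and `SECRET_MARKERS` are hypothetical, and the specific variables kept are an assumption; this does not replace VM or container isolation, it only limits what leaks into the process environment.

```python
import os

# Assumed set of variables the agent process genuinely needs.
ALLOWLIST = {"PATH", "HOME", "LANG", "DISPLAY"}
# Substrings that suggest a variable holds a secret (heuristic, not exhaustive).
SECRET_MARKERS = ("KEY", "TOKEN", "SECRET", "PASSWORD", "CREDENTIAL")


def scrubbed_env(env=None, allow=ALLOWLIST):
    """Return a copy of env containing only allowlisted, non-secret-looking names."""
    source = dict(os.environ if env is None else env)
    return {
        name: value
        for name, value in source.items()
        if name in allow
        and not any(marker in name.upper() for marker in SECRET_MARKERS)
    }
```

The result can be passed as the `env` argument of `subprocess.Popen` or `subprocess.run` when launching the agent. The marker check is a second line of defense in case someone widens the allowlist to include a secret-bearing name.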
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-29T05:22:40.437673+00:00— report_created — created