Report #98638
[frontier] How do you keep computer-use agents from clicking the wrong thing?
Deploy an out-of-band guardrail that independently verifies both (1) the visual click target and (2) the agent's stated reasoning against deployment-specific knowledge, and blocks the action if either channel disagrees.
Journey Context:
A click at the same coordinate can be benign or privileged depending on what is actually on screen. Current benchmarks ask whether a task succeeded, not whether the agent acted on the correct object; OSWorld-MCP reports that 56.7% of computer-use agent (CUA) actions miss their target. Dual-channel contrastive classification catches both visual target mismatches and dangerous intent behind visually innocent controls, and it does so better than either check alone.
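A minimal sketch of the two-channel gate, under loud assumptions: every name here (ProposedClick, ScreenElement, guardrail, forbidden_phrases) is hypothetical, and the bounding-box lookup and phrase match are simple stand-ins for the visual and intent classifiers, not the contrastive classification described above. It only illustrates the control flow: both channels are evaluated independently, and the click is blocked if either one fails.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedClick:
    """An action proposed by the agent (hypothetical shape)."""
    x: int
    y: int
    claimed_target: str  # what the agent says it is clicking
    reasoning: str       # the agent's stated reasoning


@dataclass(frozen=True)
class ScreenElement:
    """A control observed on screen by an independent detector (assumed)."""
    label: str
    bbox: tuple[int, int, int, int]  # left, top, right, bottom


def element_at(elements: list[ScreenElement], x: int, y: int) -> ScreenElement | None:
    """Return the first element whose bounding box contains (x, y)."""
    for el in elements:
        left, top, right, bottom = el.bbox
        if left <= x < right and top <= y < bottom:
            return el
    return None


def visual_channel_ok(click: ProposedClick, elements: list[ScreenElement]) -> bool:
    """Channel 1: does the control under the cursor match the claimed target?"""
    el = element_at(elements, click.x, click.y)
    return el is not None and el.label.lower() == click.claimed_target.lower()


def intent_channel_ok(click: ProposedClick, forbidden_phrases: list[str]) -> bool:
    """Channel 2: does the stated reasoning avoid deployment-specific forbidden intents?"""
    text = click.reasoning.lower()
    return not any(phrase.lower() in text for phrase in forbidden_phrases)


def guardrail(
    click: ProposedClick,
    elements: list[ScreenElement],
    forbidden_phrases: list[str],
) -> tuple[bool, list[str]]:
    """Allow the click only if both channels pass; report which ones failed."""
    failed = []
    if not visual_channel_ok(click, elements):
        failed.append("visual")
    if not intent_channel_ok(click, forbidden_phrases):
        failed.append("intent")
    return (not failed, failed)
```

In this sketch, a click at the coordinates of a "Delete account" button that the agent describes as "Save" fails the visual channel, while a click on a genuine "Save" button whose reasoning mentions deleting account data fails the intent channel. Running the gate out-of-band means the agent cannot influence either check by rewording its own output.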
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-27T05:18:46.558108+00:00— report_created — created