Report #100047
[frontier] My agent thinks it clicked but nothing actually happened
Verify before each action (element present, enabled, and in viewport) and visually confirm after it (state changed as expected) before moving to the next step. For irreversible actions, require explicit human approval.
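A minimal sketch of the verify-act-verify step described above. It assumes the agent can already resolve a target into a record with presence, enabled state, and a bounding box. `Element`, `pre_check`, `verified_click`, and the `screenshot`/`click` callables are hypothetical names for illustration, not part of any particular CUA framework. The default post-check only detects that the screen changed at all; a real check would test for the specific expected state.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class Element:
    """Hypothetical view of a click target as resolved by the agent."""
    present: bool
    enabled: bool
    box: Tuple[int, int, int, int]  # x, y, width, height in pixels


def in_viewport(box: Tuple[int, int, int, int], viewport: Tuple[int, int]) -> bool:
    """True if the whole bounding box lies inside the viewport."""
    x, y, w, h = box
    vw, vh = viewport
    return x >= 0 and y >= 0 and x + w <= vw and y + h <= vh


def pre_check(el: Optional[Element], viewport: Tuple[int, int]) -> bool:
    """Pre-action verification: present, enabled, and in viewport."""
    return (
        el is not None
        and el.present
        and el.enabled
        and in_viewport(el.box, viewport)
    )


def verified_click(
    el: Optional[Element],
    viewport: Tuple[int, int],
    screenshot: Callable[[], bytes],
    click: Callable[[Element], None],
    expect: Callable[[bytes, bytes], bool] = lambda before, after: before != after,
) -> str:
    """Run one click with checks on both sides.

    Returns "skipped" if the pre-check fails (no click is sent),
    "no-op" if the post-check fails, and "ok" otherwise.
    """
    if not pre_check(el, viewport):
        return "skipped"
    before = screenshot()
    click(el)
    after = screenshot()
    return "ok" if expect(before, after) else "no-op"
```

A caller would treat "skipped" and "no-op" as signals to re-plan or retry rather than proceeding to the next step. Passing a stricter `expect` (for example, one that looks for a specific dialog or field value) narrows the check from "something changed" to "the right thing changed".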
Journey Context:
CUAs operate in a loop of screenshot → action → screenshot, but models often hallucinate success or miss that a click did nothing. OSWorld-MCP reports 56.7% of CUA actions miss their intended target across 369 tasks. Production systems are moving from 'fire and forget' to 'verify every step': pre-action checks prevent clicks on ghosts, post-action checks catch no-op clicks, and risk-gated oversight escalates consequential actions. Anthropic's docs explicitly recommend human confirmation for meaningful actions. The cost of verification is high, but the cost of a wrong irreversible action is higher.
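The risk-gated oversight mentioned above could look roughly like the following sketch. It assumes each planned action has already been labeled reversible or irreversible; how that label is assigned is outside this example. `Risk`, `gated_execute`, and the `approve` callback are hypothetical names, and `approve` stands in for whatever channel asks a human to confirm.

```python
from enum import Enum
from typing import Callable


class Risk(Enum):
    REVERSIBLE = "reversible"
    IRREVERSIBLE = "irreversible"


def gated_execute(
    name: str,
    risk: Risk,
    execute: Callable[[], None],
    approve: Callable[[str], bool],
) -> str:
    """Run an action, asking for approval first if it cannot be undone.

    Returns "blocked" if approval was required and denied (the action
    is never run), otherwise "executed".
    """
    if risk is Risk.IRREVERSIBLE and not approve(name):
        return "blocked"
    execute()
    return "executed"
```

Reversible actions skip the approval call entirely, so the human is only interrupted when a mistake would be costly.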
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-30T05:30:15.433312+00:00— report_created — created