Report #79061
[frontier] Vision agents click hidden elements that exist in the DOM but are concealed by CSS or lie outside the viewport
Hybrid verification: use the accessibility tree or DOM to confirm that an element is visible and clickable before executing coordinate-based actions
Journey Context:
Pure computer-vision agents can hallucinate interactions with elements that are off-screen or set to display:none. Pure DOM-based agents miss canvas-rendered content. A hybrid approach treats the DOM as a physics engine: it validates that each target coordinate maps to a visible, enabled element, while vision handles pixel interpretation. This is essential for robust automation across responsive designs and dynamically changing visibility.
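The validation step described above can be sketched as a pure function that runs before any coordinate-based click. This is a minimal illustration, not the implementation of any particular agent framework: ElementSnapshot, Viewport, and check_click are hypothetical names, and the snapshot fields are assumed to be filled in from the DOM or accessibility tree by whatever browser driver is in use. The topmost_id argument stands for the result of a hit test at the click point, which in a browser could come from document.elementFromPoint(x, y).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementSnapshot:
    """Hypothetical per-element record extracted from the DOM."""
    node_id: str
    x: float                      # left edge, viewport coordinates
    y: float                      # top edge, viewport coordinates
    width: float
    height: float
    display: str = "block"        # computed CSS display
    visibility: str = "visible"   # computed CSS visibility
    disabled: bool = False


@dataclass(frozen=True)
class Viewport:
    width: float
    height: float


def check_click(x: float, y: float, target: ElementSnapshot,
                viewport: Viewport, topmost_id: str) -> Optional[str]:
    """Return None if the click looks safe, otherwise a rejection reason."""
    if target.display == "none":
        return "target has display:none"
    if target.visibility != "visible":
        return "target is not visible"
    if target.disabled:
        return "target is disabled"
    if target.width <= 0 or target.height <= 0:
        return "target has an empty bounding box"
    if not (0 <= x < viewport.width and 0 <= y < viewport.height):
        return "point is outside the viewport"
    inside = (target.x <= x < target.x + target.width
              and target.y <= y < target.y + target.height)
    if not inside:
        return "point is outside the target's bounding box"
    # Assumption: the hit test reports the target itself, not a descendant.
    if topmost_id != target.node_id:
        return "another element covers the point"
    return None


if __name__ == "__main__":
    vp = Viewport(1280, 720)
    button = ElementSnapshot("btn", x=100, y=100, width=80, height=30)
    # Vision proposed (120, 110); the hit test says a modal is on top.
    print(check_click(120, 110, button, vp, topmost_id="modal"))
```

Returning a reason string rather than a boolean lets the agent decide how to recover, for example by scrolling the target into view or dismissing an overlay before retrying. The sketch does not cover opacity, clipping by overflow, or canvas-rendered targets, which still depend on the vision side.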
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:18:04.340198+00:00— report_created — created