Report #64031
[frontier] Screenshot-based agents hallucinate UI elements that look clickable but aren't
Validate visual saliency against the accessibility tree: if an element looks like a button but has no interactive accessible role (for example, button or link), flag it as decorative or background.
Journey Context:
Vision-only agents (e.g., early GPT-4V experiments) frequently attempt to click on icons in hero images or background graphics that resemble buttons. The accessibility tree provides the ground truth of what is actually interactive. Leading agents now perform a 'reality check': vision proposes candidates, accessibility tree confirms interactivity.
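The 'reality check' described above could be sketched roughly as follows. This is a minimal illustration, not any particular agent's implementation. The names `Box`, `AxNode`, `Candidate`, `reality_check`, and `INTERACTIVE_ROLES` are hypothetical. Matching vision candidates to accessibility nodes by bounding-box overlap (IoU) and the 0.5 threshold are assumptions, since the report does not say how candidates are matched.

```python
from dataclasses import dataclass

# Assumed set of roles treated as interactive; adjust for the target platform.
INTERACTIVE_ROLES = {"button", "link", "checkbox", "menuitem", "tab", "textbox"}


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in screen pixels (hypothetical type)."""
    x: float
    y: float
    w: float
    h: float

    def area(self) -> float:
        return max(self.w, 0.0) * max(self.h, 0.0)


@dataclass(frozen=True)
class AxNode:
    """One accessibility-tree node, reduced to the fields used here."""
    role: str
    name: str
    box: Box


@dataclass(frozen=True)
class Candidate:
    """An element the vision model believes is clickable."""
    label: str
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    inter = ix * iy
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def reality_check(candidates, ax_nodes, min_iou=0.5):
    """Split vision candidates into confirmed and decorative.

    A candidate is confirmed only if it overlaps an accessibility node
    with an interactive role by at least min_iou (an assumed threshold).
    """
    interactive = [n for n in ax_nodes if n.role in INTERACTIVE_ROLES]
    confirmed, decorative = [], []
    for cand in candidates:
        best = max(interactive, key=lambda n: iou(cand.box, n.box), default=None)
        if best is not None and iou(cand.box, best.box) >= min_iou:
            confirmed.append((cand, best))
        else:
            decorative.append(cand)
    return confirmed, decorative
```

With this sketch, a candidate drawn over a hero image whose only accessibility node has role `img` ends up in the decorative list, so the agent would skip the click rather than attempt it.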
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T13:57:39.270763+00:00 - report_created - created