Report #39583
[frontier] Accessibility Tree Visual Dissonance: Pure vision agents fail on Canvas/WebGL UIs; pure DOM agents fail on Shadow DOM/React Fiber hydration mismatches
Use the Chrome Accessibility Tree (AXTree) as the primary structure, with rendered bounding boxes (DOM.getBoxModel) for spatial grounding, rather than relying on raw DOM or pixels alone
Journey Context:
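Screenshot-based agents cannot 'see' into Canvas charts or WebGL 3D configurators. DOM-based agents break on React virtualized lists, where DOM nodes are recycled as the user scrolls. The Chrome DevTools Protocol provides Accessibility.getFullAXTree, which returns the semantic structure that screen readers use: each node's role, name, value, and state properties, plus a backendDOMNodeId that links it to the DOM. The AX nodes themselves carry no coordinates; bounding boxes come from a separate DOM.getBoxModel call, which accepts that backend node ID directly, so no DOM.querySelector step is needed. The tree is largely stable across visual changes (themes, responsive layouts) and represents the actual interactive semantic layer. Combining the AXTree with rendered coordinates gives a representation that holds up under CSS transforms and Shadow DOM encapsulation. Canvas and WebGL content appears in the tree only to the extent that the page exposes it through fallback content or ARIA attributes.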
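A minimal sketch of pairing AX nodes with rendered boxes, assuming a hypothetical `send(method, params)` callable that sends one Chrome DevTools Protocol command and returns its result dict, raising an exception on a protocol error. The helper names `quad_to_bbox`, `ax_value`, and `grounded_ax_nodes` are illustrative and not part of any library; only the two CDP method names and their field names come from the protocol. Using the border quad as the target area is one reasonable choice, not the only one.

```python
from typing import Any, Callable, Dict, List, Optional

# Hypothetical transport (an assumption): sends one CDP command, returns its
# result dict, and raises an exception when the browser reports an error.
CdpSend = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def quad_to_bbox(quad: List[float]) -> Dict[str, float]:
    """Collapse a CDP quad (x1, y1, ..., x4, y4) into an axis-aligned box."""
    xs, ys = quad[0::2], quad[1::2]
    return {
        "x": min(xs),
        "y": min(ys),
        "width": max(xs) - min(xs),
        "height": max(ys) - min(ys),
    }


def ax_value(node: Dict[str, Any], key: str) -> Optional[Any]:
    """Read the plain value out of an AXValue field such as role or name."""
    field = node.get(key)
    return field.get("value") if isinstance(field, dict) else None


def grounded_ax_nodes(send: CdpSend) -> List[Dict[str, Any]]:
    """Return non-ignored AX nodes that have a rendered box, with coordinates."""
    nodes = send("Accessibility.getFullAXTree", {})["nodes"]
    grounded = []
    for node in nodes:
        backend_id = node.get("backendDOMNodeId")
        # Skip nodes the accessibility layer ignores or that have no DOM node.
        if node.get("ignored") or backend_id is None:
            continue
        try:
            # getBoxModel accepts the backend node ID directly.
            model = send("DOM.getBoxModel", {"backendNodeId": backend_id})["model"]
        except Exception:
            # Nodes without a layout box (for example, hidden ones) are skipped.
            continue
        grounded.append({
            "role": ax_value(node, "role"),
            "name": ax_value(node, "name"),
            "backendDOMNodeId": backend_id,
            "bbox": quad_to_bbox(model["border"]),
        })
    return grounded
```

Because a quad carries all four corners, it reflects CSS transforms; the axis-aligned box is a simplification, and an agent that needs an exact click point on a rotated element may prefer the quad's centroid. This sketch issues one `DOM.getBoxModel` call per node, which may be slow on large pages.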
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T20:54:46.585573+00:00— report_created — created