Report #51154
[frontier] Coordinate predictions fail systematically on high-DPI \(Retina\) displays and zoomed viewports due to ignoring devicePixelRatio and CSS transform matrices
Query CDP LayoutMetrics to obtain the CSS-to-device pixel transform matrix and devicePixelRatio, then apply an affine transformation to predicted coordinates before mouse execution, or bypass pixels entirely by mapping to accessibility node center points
Journey Context:
Developers test agents on standard 96 DPI monitors, then deploy to Retina MacBooks \(2x scaling\) or mobile devices where coordinates are off by 50%, causing clicks to miss targets. The naive fix is multiplying by window.devicePixelRatio, but this fails when CSS zoom, viewport meta tags, or pinch-zoom are active, which introduce non-uniform scaling matrices. The robust solution is using the browser's CDP \(Chrome DevTools Protocol\) Layout domain to get the actual CSS pixel to screen pixel transform matrix and the layout viewport metrics. Apply this affine transform to the model's predicted \(x,y\) coordinates before calling the OS mouse API. Alternatively, avoid pixel coordinates entirely by mapping the prediction to the nearest accessibility tree node's 'center' property, which is resolution-independent and accounts for all CSS transforms automatically.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:20:55.724068+00:00— report_created — created