Report #35186
[frontier] Coordinate prediction failure on responsive layouts and dynamic viewports
Use semantic element identifiers (role and aria-label from the accessibility tree, or a test ID attribute) instead of absolute pixel coordinates; map actions to element IDs, which persist across viewport changes.
Journey Context:
Predicting pixel coordinates (x, y) from screenshots fails when the browser window is resized, the zoom level changes (Ctrl+/-), or the page is responsive (mobile vs. desktop layouts): a coordinate that was valid at screenshot time can be invalid by action time. The more robust pattern targets elements through the accessibility tree's element IDs or through unique selectors (for example Playwright's getByRole), which persist across viewport changes. The agent then reasons about 'click the Submit button' (semantic) rather than 'click (450, 300)' (pixel). A common mistake is training models on fixed-resolution screenshots without viewport normalization, or using percentage coordinates, which fail with CSS transforms.
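The difference between the two approaches can be sketched without a browser. The snippet below is an illustration only: `AxNode`, `SemanticTarget`, `resolve`, and both snapshots are hypothetical names and invented data, not part of any real accessibility API. It stores the target as role plus name and looks it up in a fresh snapshot at action time, then compares that with replaying a pixel coordinate captured earlier.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AxNode:
    """One node of a hypothetical accessibility-tree snapshot."""
    role: str
    name: str
    x: float       # left edge, CSS pixels
    y: float       # top edge, CSS pixels
    width: float
    height: float

    def center(self) -> tuple[float, float]:
        return (self.x + self.width / 2, self.y + self.height / 2)

    def contains(self, px: float, py: float) -> bool:
        return (self.x <= px <= self.x + self.width
                and self.y <= py <= self.y + self.height)


@dataclass(frozen=True)
class SemanticTarget:
    """What the agent stores: 'the Submit button', not '(450, 300)'."""
    role: str
    name: str


def resolve(target: SemanticTarget, snapshot: list[AxNode]) -> AxNode:
    """Find the unique matching node in a snapshot taken at action time."""
    matches = [n for n in snapshot
               if n.role == target.role and n.name == target.name]
    if len(matches) != 1:
        raise LookupError(
            f"expected exactly one match for {target}, got {len(matches)}")
    return matches[0]


# Invented snapshots of the same page in a desktop and a narrow mobile layout.
desktop = [
    AxNode("textbox", "Email", 400, 200, 300, 40),
    AxNode("button", "Submit", 400, 280, 100, 40),
]
mobile = [
    AxNode("textbox", "Email", 20, 120, 335, 44),
    AxNode("button", "Submit", 20, 600, 335, 48),
]

submit = SemanticTarget(role="button", name="Submit")

# Pixel approach: a coordinate captured from the desktop screenshot.
stale_click = resolve(submit, desktop).center()

# Replayed after the layout changed, it misses the button entirely.
hits_button_on_mobile = resolve(submit, mobile).contains(*stale_click)

# Semantic approach: re-resolve against the current snapshot, then click.
fresh_click = resolve(submit, mobile).center()

print(stale_click, hits_button_on_mobile, fresh_click)
```

With Playwright's Python bindings, the same lookup would typically be `page.get_by_role("button", name="Submit").click()`, where the locator is resolved against the live page at action time rather than against a stored position.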
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T13:31:53.292637+00:00— report_created — created