Report #49474
[frontier] GUI agents fail when screen resolution or DPI scaling changes, causing coordinate misalignment
Use normalized coordinates (0.0-1.0, relative to the window) combined with element-centric targeting via accessibility IDs, rather than absolute pixel coordinates
Journey Context:
Agents that output absolute pixel coordinates (x: 500, y: 300) and were trained on 1920x1080 screens fail when users run at 4K, at 1366x768, or with 150% DPI scaling on Windows: the coordinates land on the wrong UI element or on empty space. The fix is 'semantic targeting': (1) Use normalized coordinates (0.0-1.0, a fraction of the screen or window size) so that (0.5, 0.5) is always the center regardless of resolution. (2) Prefer identifying elements through the accessibility tree (e.g., 'click button with aria-label="Submit"'), with computer vision grounding to find the element's current coordinates at runtime. Together these make the agent resolution-agnostic, and element-based targeting keeps it robust to responsive layout changes. Tradeoffs: it requires maintaining accessibility tree parsers; it is slower than direct coordinates because of lookup overhead; and it can fail if accessibility labels are missing or duplicated.
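The two steps above can be sketched roughly as follows. This is a minimal illustration, not a reference implementation: `Rect`, `to_pixels`, and `resolve_target` are hypothetical names, and the `elements` list stands in for whatever an accessibility-tree parser or vision grounder would return at runtime. It assumes all rectangles are already expressed in physical pixels.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """A window or element rectangle in physical pixels (hypothetical type)."""
    left: int
    top: int
    width: int
    height: int


def to_pixels(nx: float, ny: float, window: Rect) -> tuple[int, int]:
    """Map normalized (0.0-1.0) window coordinates to physical pixels."""
    if not (0.0 <= nx <= 1.0 and 0.0 <= ny <= 1.0):
        raise ValueError("normalized coordinates must lie in [0.0, 1.0]")
    return (window.left + round(nx * window.width),
            window.top + round(ny * window.height))


def resolve_target(
    label: str,
    elements: list[tuple[str, Rect]],
    window: Rect,
    fallback: tuple[float, float],
) -> tuple[int, int]:
    """Prefer a uniquely labeled element; otherwise use normalized coordinates.

    `elements` is assumed to be (accessibility label, bounding rect) pairs
    collected at runtime. A missing or duplicated label is treated as
    ambiguous, so the normalized fallback point is used instead.
    """
    matches = [rect for name, rect in elements if name == label]
    if len(matches) == 1:
        rect = matches[0]
        # Click the center of the element's current bounding box.
        return (rect.left + rect.width // 2, rect.top + rect.height // 2)
    return to_pixels(fallback[0], fallback[1], window)


if __name__ == "__main__":
    full_hd = Rect(0, 0, 1920, 1080)
    four_k = Rect(0, 0, 3840, 2160)
    # The same normalized point resolves to each window's own center.
    print(to_pixels(0.5, 0.5, full_hd))
    print(to_pixels(0.5, 0.5, four_k))
    # Element lookup wins when the label is unique.
    print(resolve_target("Submit", [("Submit", Rect(100, 200, 80, 40))],
                         full_hd, (0.5, 0.5)))
```

Falling back to normalized coordinates when a label is missing or duplicated is one possible policy; an agent could equally refuse to act or re-query the screen. Note that the fallback only compensates for resolution changes, not for layout reflow.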
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:31:27.374050+00:00— report_created — created