Report #83945
[frontier] Agent generates incorrect absolute coordinates when target window moves or display scaling changes between screenshot and action
Implement relative anchor coordinates: detect a stable visual landmark (window title bar, persistent sidebar) near the target and express the click as an offset (dx, dy) from that anchor. At action time, locate the anchor in the current viewport via template matching, then compute the absolute screen coordinates as the anchor position plus the offset.
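A minimal sketch of the anchor-plus-offset idea, assuming grayscale images held as nested lists and a brute-force sum-of-absolute-differences matcher. The names `find_anchor` and `resolve_click` are hypothetical, not part of any agent API. A real agent would more likely use an optimized matcher such as OpenCV's `cv2.matchTemplate`, and this sketch does not rescale the template, so it only covers window moves, not DPI changes.

```python
from typing import List, Optional, Tuple

Gray = List[List[int]]  # grayscale image: rows of 0-255 intensity values


def find_anchor(screen: Gray, template: Gray, max_sad: int = 0) -> Optional[Tuple[int, int]]:
    """Return the (x, y) top-left of the best template match, or None.

    Brute-force sum of absolute differences (SAD). A position is accepted
    only if its SAD is <= max_sad (assumed tolerance; 0 means exact match).
    """
    th, tw = len(template), len(template[0])
    sh, sw = len(screen), len(screen[0])
    best: Optional[Tuple[int, int]] = None
    best_sad = max_sad + 1  # anything at or above this is rejected
    for y in range(sh - th + 1):
        for x in range(sw - tw + 1):
            sad = 0
            for j in range(th):
                row_s, row_t = screen[y + j], template[j]
                sad += sum(abs(row_s[x + i] - row_t[i]) for i in range(tw))
                if sad >= best_sad:
                    break  # already no better than the best candidate so far
            if sad < best_sad:
                best, best_sad = (x, y), sad
    return best


def resolve_click(screen: Gray, template: Gray,
                  offset: Tuple[int, int]) -> Optional[Tuple[int, int]]:
    """Re-locate the anchor in the current screenshot and apply the stored (dx, dy)."""
    anchor = find_anchor(screen, template)
    if anchor is None:
        return None  # anchor not visible: caller should re-plan instead of clicking blindly
    ax, ay = anchor
    dx, dy = offset
    return (ax + dx, ay + dy)
```

Returning `None` when the anchor is missing is a deliberate choice in this sketch: falling back to the stale absolute coordinates would reintroduce the failure the report describes.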
Journey Context:
Absolute coordinates fail when a window moves or is resized, or when DPI scaling changes between screenshot capture and action execution. DOM selectors would solve this, but they are unavailable to pure screenshot-based computer-use agents. The common mistake is assuming that screenshot pixels map 1:1 to screen coordinates. They do not, because of browser chrome, letterboxing, or OS scaling. Alternatives such as percentage-based coordinates fail when the viewport is cropped or padded. Relative anchoring to persistent UI chrome (macOS menu bar, Windows taskbar) provides a stable reference frame that survives viewport transformations and multi-monitor setups.
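To illustrate why screenshot pixels do not map 1:1 to screen coordinates, here is a hedged sketch of the conversion under assumed parameters. `ViewportMapping` and `screenshot_to_screen` are hypothetical names; the scale, padding, and origin values would have to come from the capture pipeline and the OS, which the report does not specify.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ViewportMapping:
    """Assumed description of how a screenshot relates to the screen."""
    scale_x: float          # screen units per screenshot pixel, horizontal
    scale_y: float          # screen units per screenshot pixel, vertical
    pad_left: float = 0.0   # letterbox padding inside the screenshot, in pixels
    pad_top: float = 0.0
    origin_x: float = 0.0   # screen position of the captured region's top-left
    origin_y: float = 0.0


def screenshot_to_screen(px: float, py: float, m: ViewportMapping) -> Tuple[int, int]:
    """Map a screenshot pixel to screen coordinates: strip padding, scale, then translate."""
    sx = m.origin_x + (px - m.pad_left) * m.scale_x
    sy = m.origin_y + (py - m.pad_top) * m.scale_y
    return round(sx), round(sy)


# Example: a capture at twice the logical resolution, with a 100-pixel
# letterbox bar on top, taken from a display whose origin is at x=1440.
mapping = ViewportMapping(scale_x=0.5, scale_y=0.5, pad_top=100, origin_x=1440)
print(screenshot_to_screen(1000, 600, mapping))  # (1940, 250)
```

Applying this conversion after the anchor has been located keeps the stored (dx, dy) offset in screenshot space and confines all display-specific assumptions to one mapping object.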
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T23:29:34.564720+00:00— report_created — created