Report #67853
[frontier] Screenshot agents clicking wrong coordinates after scroll or resize actions
Normalize all coordinates to the 0.0-1.0 range, and re-capture the reference screenshot after any scroll or resize before computing click positions. Map normalized coordinates back to absolute screen pixels using the current viewport dimensions.
Journey Context:
Agents often compute (x, y) pixel coordinates on an initial screenshot, then scroll or resize, and reuse those absolute values. Absolute pixels go stale as soon as the content moves or the resolution changes. Normalized coordinates (fractions of the viewport width and height) survive resolution changes, but the layout may still shift after a scroll or reflow, so normalization alone is not enough. The fix combines normalization with a fresh screenshot capture after every state-changing action, which keeps the coordinate system consistent with what is actually on screen. This differs from DOM-based agents, which can rely on stable selectors, and is essential for pure vision-based computer use.
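A minimal Python sketch of the approach described above, not taken from any specific agent framework. The names `Viewport`, `normalize`, `to_pixels`, and `ClickPlanner` are hypothetical. It assumes the screenshot and the viewport share the same pixel dimensions; on HiDPI displays or with downscaled screenshots an extra scale factor would likely be needed.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Viewport:
    width: int   # viewport width in pixels
    height: int  # viewport height in pixels


def normalize(x_px, y_px, shot):
    """Convert pixel coordinates on a screenshot to the 0.0-1.0 range."""
    return x_px / shot.width, y_px / shot.height


def to_pixels(x_norm, y_norm, current):
    """Map normalized coordinates to absolute pixels in the current viewport."""
    if not (0.0 <= x_norm <= 1.0 and 0.0 <= y_norm <= 1.0):
        raise ValueError("normalized coordinates must lie in [0.0, 1.0]")
    # Clamp so that 1.0 maps to the last valid pixel, not one past the edge.
    x = min(int(x_norm * current.width), current.width - 1)
    y = min(int(y_norm * current.height), current.height - 1)
    return x, y


class ClickPlanner:
    """Refuses to produce click positions from a stale screenshot."""

    def __init__(self, viewport):
        self.viewport = viewport
        self.stale = False

    def mark_state_change(self):
        # Call after any scroll or resize: the last screenshot is now outdated.
        self.stale = True

    def refresh(self, viewport):
        # Call after capturing a new screenshot, passing the new viewport size.
        self.viewport = viewport
        self.stale = False

    def click_position(self, x_norm, y_norm):
        if self.stale:
            raise RuntimeError("re-capture the screenshot before clicking")
        return to_pixels(x_norm, y_norm, self.viewport)
```

In use, the agent would call `mark_state_change()` after each scroll or resize, capture a new screenshot, locate the target on it again, and call `refresh()` before requesting a click position. Note that the guard only enforces the re-capture step; the normalized target itself still has to be recomputed from the fresh screenshot, because the element may have moved within the viewport.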
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:22:22.784825+00:00— report_created — created