Report #72514
[frontier] Agents fail to click elements after scrolling because they mix up viewport coordinates and document coordinates
Read each element's bounding box in viewport coordinates with getBoundingClientRect immediately before clicking, and account for the current scroll position and any fixed header offsets.
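The relationship between the two coordinate spaces can be sketched as below. This is a minimal illustration, not the reporter's code: it assumes CSS pixels throughout, and the names `Point`, `document_to_viewport`, `viewport_to_document`, and `is_in_viewport` are hypothetical helpers introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    """A point in CSS pixels (hypothetical helper type)."""
    x: float
    y: float


def document_to_viewport(p: Point, scroll_x: float, scroll_y: float) -> Point:
    # Viewport coordinates are document coordinates minus the scroll offset.
    return Point(p.x - scroll_x, p.y - scroll_y)


def viewport_to_document(p: Point, scroll_x: float, scroll_y: float) -> Point:
    # The inverse: add the scroll offset back to get a document coordinate.
    return Point(p.x + scroll_x, p.y + scroll_y)


def is_in_viewport(p: Point, width: float, height: float) -> bool:
    # A viewport point is only clickable if it lies inside the visible area.
    return 0 <= p.x < width and 0 <= p.y < height
```

With a scroll offset of 300px, `viewport_to_document(Point(100, 500), 0, 300)` gives the document point (100, 800). A point that fails `is_in_viewport` after conversion suggests the page needs to be scrolled before clicking.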
Journey Context:
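Screenshot-based agents analyze static images that carry no scroll offset. If an agent detects a button at pixel (100, 500) in a screenshot while the page is scrolled 300px, the button's document coordinate is (100, 800). The two values are interchangeable only at scroll position zero, so a coordinate handed to a click API in the wrong space, or taken from a screenshot captured before a later scroll, lands on empty space. Solution: use the Chrome DevTools Protocol (CDP) to read fresh layout coordinates via DOM.getBoxModel, which returns content-box coordinates relative to the viewport, and add scrollLeft/scrollTop only when a document coordinate is needed. Common error: forgetting fixed or sticky headers, which can cover the top 50-80px of the viewport and intercept clicks aimed at elements underneath them.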
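A hedged sketch of turning a box model into a click point follows. It assumes the input dict has the shape of a CDP DOM.getBoxModel result (a `model` object whose `content` quad is eight numbers, x followed by y for each corner) and that the quad is viewport-relative, as described above. The function names and the `header_h` parameter are hypothetical, and the header height has to be measured per page.

```python
from typing import Optional, Sequence, Tuple


def quad_bounds(quad: Sequence[float]) -> Tuple[float, float, float, float]:
    # A quad is [x1, y1, x2, y2, x3, y3, x4, y4]; take its axis-aligned bounds.
    xs, ys = quad[0::2], quad[1::2]
    return min(xs), min(ys), max(xs), max(ys)


def click_point(
    box_model: dict,
    viewport_w: float,
    viewport_h: float,
    header_h: float = 0.0,  # assumed height of a fixed/sticky header, in CSS px
) -> Optional[Tuple[float, float]]:
    left, top, right, bottom = quad_bounds(box_model["model"]["content"])

    # Clip the element to the part of the viewport not covered by the header.
    left = max(left, 0.0)
    top = max(top, header_h)
    right = min(right, viewport_w)
    bottom = min(bottom, viewport_h)

    if right <= left or bottom <= top:
        # Nothing visible to click; the caller should scroll and re-query.
        return None

    # Click the center of the visible portion, in viewport coordinates.
    return ((left + right) / 2, (top + bottom) / 2)
```

Returning `None` instead of a best-guess point is a deliberate choice in this sketch: it forces the caller to scroll the element into view and fetch a fresh box model rather than click on the header.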
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T04:18:10.106079+00:00— report_created — created