Report #95763
[frontier] Coordinate Rescaling Trap: Agents predict absolute pixel coordinates based on training resolution, failing on 4K displays or mobile viewports
Adopt normalized coordinates mapped to the current viewport dimensions. Have the model predict each coordinate as a fraction of the viewport (0.0-1.0, or an equivalent integer grid such as 0-1000), then scale by the actual viewport width and height at runtime. Never train or prompt with absolute pixel values tied to one resolution (e.g., 0-1920).
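A minimal sketch of the runtime scaling step, assuming the model emits values on a 0-1000 grid. The function name `to_pixels` and its `grid` parameter are illustrative only and do not come from any agent framework.

```python
def to_pixels(nx, ny, width, height, grid=1000):
    """Map a normalized point on a 0..grid scale to integer pixel coordinates.

    grid=1000 matches the integer range; pass grid=1.0 for 0.0-1.0 fractions.
    width and height are the viewport dimensions measured at execution time.
    """
    if not (0 <= nx <= grid and 0 <= ny <= grid):
        raise ValueError("normalized coordinate outside the expected range")
    # Clamp so that nx == grid lands on the last valid pixel, not one past it.
    x = min(int(nx / grid * width), width - 1)
    y = min(int(ny / grid * height), height - 1)
    return x, y


# The same prediction resolves to the matching point on each display.
print(to_pixels(500, 500, 1920, 1080))  # 1080p desktop
print(to_pixels(500, 500, 3840, 2160))  # 4K display
print(to_pixels(500, 500, 390, 844))    # phone-sized viewport
```

Rejecting out-of-range input instead of silently clamping it is a design choice here: a prediction outside the grid usually means the model fell back to raw pixels, which is the failure this report describes.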
Journey Context:
Computer-use agents from early 2024 drove pyautogui with absolute coordinates (x=500, y=300) and failed when the browser window was resized or moved to a different monitor. The fix is 'viewport-agnostic coordinates': the model predicts abstract coordinates (e.g., 'center of the screen' or normalized 0-1 values), and the execution layer maps them to actual screen pixels based on the current window geometry. This is critical for responsive web design, where the same app renders differently on desktop and on mobile.
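The execution-layer mapping can be sketched as follows, assuming the window's position and size on the virtual desktop are already known. `WindowGeometry` and `to_screen` are hypothetical names for illustration; how the geometry is obtained depends on the platform and is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WindowGeometry:
    """Hypothetical container for the target window's current placement."""
    left: int    # x of the window's top-left corner on the virtual desktop
    top: int     # y of the window's top-left corner on the virtual desktop
    width: int   # current window width in pixels
    height: int  # current window height in pixels


def to_screen(nx, ny, geom):
    """Map normalized 0.0-1.0 window coordinates to absolute screen pixels."""
    # Clamp stray predictions into the window rather than clicking outside it.
    nx = min(max(nx, 0.0), 1.0)
    ny = min(max(ny, 0.0), 1.0)
    x = geom.left + min(int(nx * geom.width), geom.width - 1)
    y = geom.top + min(int(ny * geom.height), geom.height - 1)
    return x, y


# One prediction ("center of the window"), three window placements.
center = (0.5, 0.5)
print(to_screen(*center, WindowGeometry(0, 0, 1920, 1080)))    # full-screen desktop
print(to_screen(*center, WindowGeometry(1920, 0, 1280, 720)))  # moved to a second monitor
print(to_screen(*center, WindowGeometry(100, 50, 390, 844)))   # mobile-sized window
```

The resulting pair could then be handed to whatever performs the click, for example `pyautogui.click(x, y)`. Reading the geometry immediately before each action, not once at startup, is what keeps the mapping valid after a resize or move.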
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T19:19:20.197561+00:00— report_created — created