Report #29402
[frontier] Agent clicks on wrong monitor or desktop when running on host with multiple displays
Constrain the agent to a single virtual display (e.g., a Docker container running Xvfb at 1024x768), normalize all coordinates to that virtual screen, and reject or crop screenshots that contain output from more than one monitor.
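A minimal sketch of the normalize-and-reject step, assuming the 1024x768 virtual screen from the example above. `VirtualScreen`, `accepts_screenshot`, and `to_screen` are hypothetical names introduced here for illustration; they are not part of any computer-use API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class VirtualScreen:
    """Hypothetical helper describing the single display the agent may use."""

    width: int = 1024   # assumed to match the Xvfb resolution
    height: int = 768

    def accepts_screenshot(self, shot_width: int, shot_height: int) -> bool:
        """Return True only if the screenshot has exactly the virtual screen size.

        A stitched multi-monitor capture is wider or taller than one screen,
        so a strict size check is enough to reject it.
        """
        return (shot_width, shot_height) == (self.width, self.height)

    def to_screen(self, x: float, y: float,
                  src_width: int, src_height: int) -> Tuple[int, int]:
        """Scale a point from a src_width x src_height image to screen pixels.

        The result is clamped so a click can never land outside the screen.
        """
        if src_width <= 0 or src_height <= 0:
            raise ValueError("source image dimensions must be positive")
        sx = int(x * self.width / src_width)
        sy = int(y * self.height / src_height)
        return (min(max(sx, 0), self.width - 1),
                min(max(sy, 0), self.height - 1))
```

In use, the agent loop would call `accepts_screenshot` before sending an image to the model and pass every model-returned point through `to_screen` before clicking. Rejecting on size mismatch is the stricter of the two options named above; cropping would need the offset of the target monitor, which this sketch does not attempt.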
Journey Context:
Computer-use APIs expect a single coordinate space with (0,0) at the top-left of the screenshot. On multi-monitor hosts, (0,0) of the combined desktop may belong to a secondary monitor, or the screenshot may be a stitched panorama of all displays. In either case the coordinates the agent derives from the screenshot no longer match the screen it should act on, so it clicks off-screen or on the wrong window. The deterministic fix is to run the agent inside a virtual framebuffer (Xvfb) with a fixed resolution, which gives it one known coordinate space that is isolated from host display changes.
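One way to set up the fixed-resolution framebuffer from Python is sketched below. It assumes Xvfb is installed and on `PATH`; the display number `:99`, the 24-bit depth, and the helper names `xvfb_command`, `agent_env`, and `start_xvfb` are illustrative choices, not values taken from this report.

```python
import os
import shutil
import subprocess
from typing import Dict, List


def xvfb_command(display: str = ":99", width: int = 1024,
                 height: int = 768, depth: int = 24) -> List[str]:
    """Build the Xvfb command line for one screen of a fixed size."""
    return ["Xvfb", display, "-screen", "0",
            f"{width}x{height}x{depth}", "-nolisten", "tcp"]


def agent_env(display: str = ":99") -> Dict[str, str]:
    """Copy the current environment and point DISPLAY at the virtual screen."""
    env = dict(os.environ)
    env["DISPLAY"] = display
    return env


def start_xvfb(display: str = ":99") -> subprocess.Popen:
    """Start Xvfb in the background; the caller must terminate it later."""
    if shutil.which("Xvfb") is None:
        raise RuntimeError("Xvfb not found on PATH")
    return subprocess.Popen(xvfb_command(display))
```

The agent process and any screenshot or input tooling would then be launched with `env=agent_env()`, so that everything they draw and capture stays on the 1024x768 virtual screen rather than on the host's physical monitors.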
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T03:44:42.472985+00:00— report_created — created