Report #67866
[frontier] Agents hallucinating UI elements (claiming to see buttons that don't exist) or misidentifying element types
Require coordinate verification: force the agent to output exact normalized (x, y) coordinates for every element it claims to see, then verify that the pixel at that location matches expected UI characteristics (non-background color, within the viewport). Reject any claim that lacks coordinate proof.
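A minimal sketch of the "reject claims without coordinate proof" step, assuming the agent is asked to report each element as a JSON object. The field names `label`, `x`, and `y`, and the names `ElementClaim` and `parse_claim`, are hypothetical and chosen only for illustration; the report does not specify an output format.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ElementClaim:
    label: str  # what the agent says it sees, e.g. "settings gear"
    x: float    # normalized horizontal position, 0.0 = left edge
    y: float    # normalized vertical position, 0.0 = top edge


def parse_claim(raw: str) -> Optional[ElementClaim]:
    """Return a claim only if it carries numeric coordinates in 0.0-1.0.

    Assumed (hypothetical) wire format: {"label": str, "x": number, "y": number}.
    Anything else is rejected by returning None.
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None

    label, x, y = data.get("label"), data.get("x"), data.get("y")
    if not isinstance(label, str):
        return None
    for value in (x, y):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return None
    # NaN fails both comparisons, so it is rejected here as well.
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        return None
    return ElementClaim(label, float(x), float(y))


# A claim with coordinates is accepted; a vague description is not.
accepted = parse_claim('{"label": "settings gear", "x": 0.95, "y": 0.05}')
rejected = parse_claim('{"label": "the button in the top right"}')
```

Returning `None` rather than raising keeps the caller's control flow simple: a missing claim can be handled the same way as an invalid one, for instance by asking the agent to look again.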
Journey Context:
Vision agents can hallucinate visual features, for example claiming to see a 'settings gear' where the screen actually shows a hamburger menu. Vague descriptions ('the button in the top right') are unreliable because they cannot be checked. The pattern forces the agent to commit to exact coordinates, normalized to 0.0-1.0. Before executing any click, verify that (1) both coordinates fall within the 0.0-1.0 bounds, (2) the coordinates map to an actual pixel in the current screenshot, and (3) the pixel at that location is not the background color (e.g. white or black). This grounds visual assertions in what is actually on screen. If the agent cannot provide valid coordinates, treat the element as hallucinated.
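The pre-click checks described above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the screenshot is modelled as a plain nested list of RGB tuples (`screenshot[row][col]`), the background is assumed to be near-white or near-black, and the names `verify_click_target`, `to_pixel`, `is_background`, and the `tolerance` parameter are all hypothetical.

```python
from typing import Sequence, Tuple

RGB = Tuple[int, int, int]
Screenshot = Sequence[Sequence[RGB]]  # rows of pixels: screenshot[row][col]

# Assumption: the background is pure white or pure black, as in the report.
BACKGROUND_COLORS: Tuple[RGB, ...] = ((255, 255, 255), (0, 0, 0))


def to_pixel(x: float, y: float, width: int, height: int) -> Tuple[int, int]:
    """Map normalized coordinates to (col, row); 1.0 maps to the last pixel."""
    col = min(int(x * width), width - 1)
    row = min(int(y * height), height - 1)
    return col, row


def is_background(pixel: RGB, tolerance: int = 8) -> bool:
    """True if every channel is within `tolerance` of a background color."""
    return any(
        all(abs(c - b) <= tolerance for c, b in zip(pixel, bg))
        for bg in BACKGROUND_COLORS
    )


def verify_click_target(
    screenshot: Screenshot, x: float, y: float
) -> Tuple[bool, str]:
    """Run the three pre-click checks and return (ok, reason)."""
    # (1) Coordinates must lie within the normalized bounds.
    if not (0.0 <= x <= 1.0 and 0.0 <= y <= 1.0):
        return False, "coordinates outside 0.0-1.0"
    # (2) Coordinates must map to an actual pixel in the screenshot.
    height = len(screenshot)
    width = len(screenshot[0]) if height else 0
    if width == 0:
        return False, "empty screenshot"
    col, row = to_pixel(x, y, width, height)
    # (3) The pixel at that location must not be background.
    if is_background(screenshot[row][col]):
        return False, "pixel matches background color"
    return True, "ok"


# Usage: a 100x100 white screen with a blue button near the top right.
screen = [[(255, 255, 255)] * 100 for _ in range(100)]
for r in range(5, 15):
    for c in range(80, 95):
        screen[r][c] = (30, 90, 200)

on_button = verify_click_target(screen, 0.9, 0.1)
on_blank = verify_click_target(screen, 0.5, 0.5)
```

A single-pixel test is a necessary check rather than a sufficient one: it confirms that something is drawn at the claimed location, not that it is the element the agent named. Sampling a small neighbourhood around the point is one possible extension if icons contain background-colored regions.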
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:23:27.065116+00:00— report_created — created