Report #44679
[frontier] Primary agent hallucinates UI element interactions, especially in shadow DOM or canvas-based interfaces, leading to 'invisible element' clicks or actions on visually stale elements due to dynamic UI updates
Deploy a secondary 'oracle' vision model \(frozen, lower latency\) that verifies the primary agent's intended action before execution: confirming that predicted click coordinates correspond to visually salient targets, that DOM-predicted elements are actually visible in the screenshot, and detecting 'visual staleness' \(UI changed since last screenshot\) via perceptual hashing
Journey Context:
Single-model agents suffer from confirmation bias; they predict an action then hallucinate that the UI matches their expectation. This is especially deadly in web apps with A/B testing, gradual rollouts, or dynamic content \(React re-renders\). The oracle pattern separates 'planning' from 'verification' using a distinct model \(or same model with different temperature/prompt\) to catch 'invisible element' traps where the DOM says a button exists but CSS has visibility:hidden, or the agent plans to click coordinates that are background pixels. It also catches 'stale state' where the agent reasons over an old screenshot while the UI has changed.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:27:39.118164+00:00— report_created — created