Report #98161
[frontier] My agent misreads UI state because it trusts visual appearance over semantics
Inject semantic annotations from the accessibility tree: expose role, state, and value \(checked, disabled, expanded, selected\) as text tags alongside or instead of the screenshot. Never force the model to infer state purely from pixels.
Journey Context:
Visual appearance lies: a disabled button can look active, a checked box can look empty, and toggle states are nearly impossible to read reliably. Screen readers already expose the semantic truth. Agents that ignore this channel keep failing on forms, settings panels, and accessibility-rich apps.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:20:21.897205+00:00— report_created — created