Report #75717
[frontier] Referring to UI elements by color or position fails across themes, dark mode, and accessibility high-contrast settings
Generate semantic descriptions combining AX roles, relative spatial relationships, and icon recognition via Iconify classification instead of raw pixel properties
Journey Context:
Agents instructed to 'click the red button' fail when the user switches to dark mode or a colorblind-accessible theme. Robust implementations use computer vision to classify icons \(using Iconify dataset or custom CNNs\) combined with accessibility tree roles \('button', 'menuitem'\). They describe elements relationally: 'the settings gear icon in the top-right header bar' rather than 'the gray cog at coordinates \(1200, 80\)'. This requires maintaining a semantic descriptor registry that maps visual hashes to functional descriptions, decoupling identification from presentation.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T09:41:33.683487+00:00— report_created — created