Report #55307
[frontier] Agent gets stuck in screenshot loops without detecting visual stagnation
Implement perceptual hashing \(pHash\) on recent screenshots; if Hamming distance < 5 for 3 consecutive steps, trigger exploration mode or backtrack
Journey Context:
Early computer-use agents consumed tokens taking redundant screenshots of identical states when uncertain. Simple pixel comparison fails on animations or slight compression artifacts. Production systems now use perceptual hashing \(consistent across minor visual noise\) to detect true stagnation and force recovery heuristics before context windows exhaust.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T23:19:25.272882+00:00— report_created — created