Report #100700
[research] Browser-agent evals based on predefined trajectories break as the web changes and reward shortcut-taking
Evaluate browser agents by outcome, not path, using a rubric-based verifier that separates process from outcome and controllable from uncontrollable failures; target near-zero false positives so the verifier can also be a training reward model.
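A minimal sketch of how such a rubric verdict could be structured, kept here to make the two separations concrete. Every name below (Kind, Cause, CriterionResult, Verdict, judge) is hypothetical and not taken from any published verifier; the rule that success depends only on outcome criteria is an assumption drawn from the summary above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class Kind(Enum):
    PROCESS = "process"   # how the agent behaved along the way
    OUTCOME = "outcome"   # whether the task's end state was reached


class Cause(Enum):
    NONE = "none"
    CONTROLLABLE = "controllable"      # assumed: the agent could have avoided it
    UNCONTROLLABLE = "uncontrollable"  # assumed: e.g. the site was down


@dataclass(frozen=True)
class CriterionResult:
    name: str
    kind: Kind
    passed: bool
    cause: Cause = Cause.NONE  # only meaningful when passed is False


@dataclass(frozen=True)
class Verdict:
    success: bool         # decided by outcome criteria only
    process_score: float  # fraction of process criteria passed
    blame_agent: bool     # True if any failure was controllable


def judge(results: Sequence[CriterionResult]) -> Verdict:
    outcome = [r for r in results if r.kind is Kind.OUTCOME]
    process = [r for r in results if r.kind is Kind.PROCESS]
    # Strict by design: no outcome evidence means no success, which
    # biases the verdict away from false positives.
    success = bool(outcome) and all(r.passed for r in outcome)
    process_score = (
        sum(r.passed for r in process) / len(process) if process else 1.0
    )
    blame_agent = any(
        (not r.passed) and r.cause is Cause.CONTROLLABLE for r in results
    )
    return Verdict(success, process_score, blame_agent)
```

Under these assumptions, a run that reaches the correct end state by an unexpected path still counts as a success, while a run with no verifiable outcome evidence never does.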
Journey Context:
Prescribing exact click sequences on live websites does not scale: a task usually has many valid paths, and pages change underneath any fixed trajectory. Browserbase's Universal Verifier (UV) showed that prior judges had ≥45% false positives on WebVoyager; used as a reward signal, such a judge trains models to 'fake' success. UV matches human-level agreement (Cohen's κ) and essentially eliminates false positives by using specific, non-overlapping rubric criteria. This matters for RL: a verifier that rewards plausibility teaches the model to get lucky, not to be accurate.
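The two quantities named above can be computed from paired human and verifier labels. The sketch below assumes binary success labels and defines the false-positive rate as the share of human-judged failures that the verifier accepts; the source does not state which definition it uses, so treat that choice as an assumption. The function names are illustrative.

```python
from typing import Sequence


def false_positive_rate(human: Sequence[bool], verifier: Sequence[bool]) -> float:
    """Share of human-judged failures that the verifier marks as success."""
    if len(human) != len(verifier):
        raise ValueError("label lists must have the same length")
    on_failures = [v for h, v in zip(human, verifier) if not h]
    if not on_failures:
        raise ValueError("no human-judged failures to measure against")
    return sum(on_failures) / len(on_failures)


def cohens_kappa(a: Sequence[bool], b: Sequence[bool]) -> float:
    """Cohen's kappa for two raters with binary labels."""
    n = len(a)
    if n == 0 or n != len(b):
        raise ValueError("label lists must be non-empty and equal length")
    observed = sum(x == y for x, y in zip(a, b)) / n
    rate_a = sum(a) / n
    rate_b = sum(b) / n
    # Agreement expected by chance if the raters labeled independently.
    expected = rate_a * rate_b + (1 - rate_a) * (1 - rate_b)
    if expected == 1.0:
        return 1.0  # both raters gave one identical constant label
    return (observed - expected) / (1 - expected)
```

For use as a training reward, the false-positive rate is the number to drive toward zero first: a false negative withholds reward from a good run, while a false positive pays the model for an incorrect one.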
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-07-02T04:57:16.673586+00:00— report_created — created