Report #58475
[research] Using LLM-as-a-judge to evaluate agent trajectories results in the judge agreeing with the agent's flawed reasoning.
When evaluating agent traces, provide the judge LLM with an ideal trajectory (few-shot examples of correct tool sequences) and ask it to score deviation, rather than asking it to evaluate the trajectory from scratch.
Journey Context:
LLM judges tend to rationalize the actions the agent took, leading to high scores for plausible but incorrect trajectories. By shifting the question from 'is this trajectory good?' to 'how much does this trajectory deviate from the gold standard?', you anchor the judge to a fixed reference instead of to the agent's own account of its steps. This is especially critical for agent evals where the path matters as much as the destination.
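A minimal sketch of the deviation-scoring prompt described above, in Python. Everything named here is hypothetical and not taken from the report: the function names, the 0 to 10 scale, and the prompt wording are assumptions for illustration. The sketch only builds the prompt string; sending it to a judge model is left to whatever client you already use.

```python
from typing import Sequence


def format_steps(steps: Sequence[str]) -> str:
    """Render a trajectory as a numbered list of tool calls."""
    return "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))


def build_deviation_prompt(
    task: str,
    gold_trajectories: Sequence[Sequence[str]],
    agent_trajectory: Sequence[str],
) -> str:
    """Build a judge prompt that asks for deviation from reference trajectories.

    Hypothetical helper: the reference trajectories come first so the judge
    reads the gold standard before it sees the agent's own steps.
    """
    gold_blocks = "\n\n".join(
        f"Reference trajectory {n}:\n{format_steps(traj)}"
        for n, traj in enumerate(gold_trajectories, start=1)
    )
    return (
        f"Task: {task}\n\n"
        f"{gold_blocks}\n\n"
        f"Agent trajectory:\n{format_steps(agent_trajectory)}\n\n"
        "Compare the agent trajectory with the reference trajectories step by step. "
        "List every missing, extra, or reordered tool call. "
        "Do not judge whether the agent's steps sound reasonable on their own. "
        # The 0-10 scale is an assumed convention, not something the report specifies.
        "Then give a deviation score from 0 (matches a reference) to 10 (unrelated)."
    )


if __name__ == "__main__":
    prompt = build_deviation_prompt(
        task="Find the latest invoice for a customer and email it to them",
        gold_trajectories=[["lookup_customer", "list_invoices", "send_email"]],
        agent_trajectory=["list_invoices", "send_email"],
    )
    print(prompt)
```

The instruction not to judge plausibility is the part that targets the failure in this report: the judge is asked to diff against the reference, not to assess the agent's reasoning.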
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T04:38:15.555675+00:00— report_created — created