Report #78204
[research] Agent changes break existing workflows, but evals only test final outputs, making it hard to pinpoint where the trajectory diverged
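Build Golden Trajectory datasets that define the expected sequence of tool calls and intermediate states, not just the final answer. Compare each run against its golden trajectory with a divergence metric, such as edit distance on the tool call sequence, to catch regressions and to locate the step where the run first departs from the expected path.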
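A minimal sketch of such a divergence check, assuming each trajectory is reduced to an ordered list of tool names. The function names (`tool_edit_distance`, `first_divergence`, `normalized_divergence`) and the example tool names are hypothetical and do not come from any particular eval framework. Arguments and intermediate states are ignored here and would need their own comparison.

```python
from typing import Optional, Sequence


def tool_edit_distance(golden: Sequence[str], actual: Sequence[str]) -> int:
    """Levenshtein distance between two sequences of tool names."""
    # prev[j] holds the distance between the golden prefix processed so far
    # and the first j steps of the actual trajectory.
    prev = list(range(len(actual) + 1))
    for i, g in enumerate(golden, start=1):
        curr = [i]
        for j, a in enumerate(actual, start=1):
            cost = 0 if g == a else 1
            curr.append(min(
                prev[j] + 1,         # golden step missing from the actual run
                curr[j - 1] + 1,     # extra step in the actual run
                prev[j - 1] + cost,  # same tool, or one tool swapped for another
            ))
        prev = curr
    return prev[-1]


def first_divergence(golden: Sequence[str], actual: Sequence[str]) -> Optional[int]:
    """Index of the first step where the runs differ, or None if identical."""
    for i, (g, a) in enumerate(zip(golden, actual)):
        if g != a:
            return i
    if len(golden) != len(actual):
        # One run is a strict prefix of the other.
        return min(len(golden), len(actual))
    return None


def normalized_divergence(golden: Sequence[str], actual: Sequence[str]) -> float:
    """Edit distance scaled to [0, 1] by the longer trajectory's length."""
    longest = max(len(golden), len(actual))
    return tool_edit_distance(golden, actual) / longest if longest else 0.0


# Hypothetical example: the new agent version inserts an unexpected shell call.
golden_run = ["search_docs", "fetch_page", "summarize"]
actual_run = ["search_docs", "run_shell", "fetch_page", "summarize"]

distance = tool_edit_distance(golden_run, actual_run)
diverged_at = first_divergence(golden_run, actual_run)
score = normalized_divergence(golden_run, actual_run)
```

In a regression suite, a run could be flagged when `score` exceeds a threshold chosen per workflow, with `diverged_at` reported as the step to inspect first. The threshold itself is a judgment call the report does not specify.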
Journey Context:
Final-answer evals give a false sense of security. An agent might reach the right answer via a completely new path that is fragile or insecure, and an output-only check would still pass. Golden trajectories let you verify that the agent is still following the safe, optimized logic path designed by the engineer.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T13:51:51.648019+00:00 — report_created — created