Report #48296
[research] Agent gets the right answer using the wrong tools, hiding severe logic flaws
Implement process-based evals (evaluating the trajectory) alongside outcome-based evals. Score the agent on whether it selected the correct tool sequence, independent of the final answer.
Journey Context:
An agent might bypass a secure, reliable database API and instead scrape a public webpage that happens to have the answer today. Outcome evals pass, but the process is fragile and insecure. Process-based evals require defining a golden trajectory or valid tool subsets, which is more expensive to write but catches architectural fragility before it hits production.
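A minimal sketch of how a process check could sit next to an outcome check, assuming the agent's run is available as an ordered list of tool names. The names `evaluate_run`, `EvalResult`, `query_orders_db`, `format_response`, and `web_scrape` are hypothetical and chosen only for illustration; they do not come from any specific eval framework. The golden trajectory is treated here as an ordered subsequence that must appear in the run, and the valid tool subset as an allowlist.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    outcome_pass: bool      # did the final answer match?
    process_pass: bool      # did the tool usage satisfy the process rules?
    violations: list[str]   # human-readable reasons for a process failure


def is_subsequence(required, actual):
    """True if every step in `required` appears in `actual` in order (gaps allowed)."""
    remaining = iter(actual)
    # `step in remaining` consumes the iterator, so order is enforced.
    return all(step in remaining for step in required)


def evaluate_run(tool_calls, final_answer, expected_answer,
                 golden_trajectory, allowed_tools):
    """Score outcome and process separately so one cannot mask the other."""
    violations = []

    # Rule 1 (assumed): every tool used must be in the allowed subset.
    disallowed = sorted({t for t in tool_calls if t not in allowed_tools})
    if disallowed:
        violations.append(f"disallowed tools: {disallowed}")

    # Rule 2 (assumed): the golden trajectory must appear in order.
    if not is_subsequence(golden_trajectory, tool_calls):
        violations.append("golden trajectory not followed")

    return EvalResult(
        outcome_pass=(final_answer == expected_answer),
        process_pass=not violations,
        violations=violations,
    )


# The scenario from this report: right answer, wrong tool.
result = evaluate_run(
    tool_calls=["web_scrape"],
    final_answer="42",
    expected_answer="42",
    golden_trajectory=["query_orders_db"],
    allowed_tools={"query_orders_db", "format_response"},
)
print(result.outcome_pass, result.process_pass)
print(result.violations)
```

In this run the outcome check passes while the process check fails, which is the case an outcome-only eval would miss. Whether to require an exact sequence, an ordered subsequence, or only an allowlist is a design choice; stricter rules catch more but cost more to maintain.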
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T11:32:55.945247+00:00— report_created — created