Report #7747
[research] Agent evals show high success rates but tasks are actually incomplete because the agent gives up gracefully
Differentiate between graceful failure (the agent says 'I cannot do this') and success. Evals must penalize unfulfilled user intents even when the agent neither crashed nor hallucinated. Score runs against a strict task-completion rubric rather than a no-error rubric.
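A minimal sketch of the difference between the two rubrics, in Python. The names here (`EpisodeResult`, `Outcome`, `classify`, and the three boolean fields) are hypothetical and not taken from any particular eval framework; the sketch assumes the harness can already tell whether a run crashed, whether its output was flagged as hallucinated, and whether a task-specific check of the end state passed.

```python
from dataclasses import dataclass
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    GAVE_UP = "gave_up"  # no crash, no hallucination, but intent unfulfilled
    ERROR = "error"      # crash or hallucinated result


@dataclass
class EpisodeResult:
    # All three fields are assumed to be supplied by the eval harness.
    crashed: bool
    hallucinated: bool
    intent_fulfilled: bool


def classify(result: EpisodeResult) -> Outcome:
    """Map one run to an outcome; only a fulfilled intent counts as success."""
    if result.crashed or result.hallucinated:
        return Outcome.ERROR
    if result.intent_fulfilled:
        return Outcome.SUCCESS
    return Outcome.GAVE_UP


def no_error_score(result: EpisodeResult) -> float:
    """Lenient rubric: passes any run that did not crash or hallucinate."""
    return 0.0 if (result.crashed or result.hallucinated) else 1.0


def strict_score(result: EpisodeResult) -> float:
    """Strict rubric: passes only runs classified as SUCCESS."""
    return 1.0 if classify(result) is Outcome.SUCCESS else 0.0


# A polite refusal: nothing went wrong, but nothing got done either.
refusal = EpisodeResult(crashed=False, hallucinated=False, intent_fulfilled=False)
```

For the `refusal` run, `no_error_score` returns 1.0 while `strict_score` returns 0.0, which is the gap the report describes.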
Journey Context:
When agents encounter difficulty, they are often prompted to apologize and exit gracefully rather than hallucinate. While this reduces catastrophic errors, it creates a false sense of high reliability in evals if 'no hallucination' is conflated with 'task success'. You must track the give-up rate as a distinct failure mode.
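One way to report the give-up rate as its own metric is sketched below. It assumes each run has already been labeled with one of three strings, "success", "gave_up", or "error"; those labels, the `summarize` function, and the sample counts are all hypothetical and chosen only to illustrate the arithmetic.

```python
from collections import Counter


def summarize(outcomes: list[str]) -> dict[str, float]:
    """Compute per-outcome rates plus the lenient no-error rate."""
    if not outcomes:
        raise ValueError("no outcomes to summarize")
    counts = Counter(outcomes)
    total = len(outcomes)
    return {
        "completion_rate": counts["success"] / total,
        "give_up_rate": counts["gave_up"] / total,
        "error_rate": counts["error"] / total,
        # What a no-error rubric would report as "success".
        "no_error_rate": (counts["success"] + counts["gave_up"]) / total,
    }


# Made-up labels for ten runs: 4 completed, 5 gave up gracefully, 1 errored.
labels = ["success"] * 4 + ["gave_up"] * 5 + ["error"]
report = summarize(labels)
```

With these made-up labels the no-error rate is 0.9 while the completion rate is 0.4; the difference is exactly the give-up rate of 0.5, so reporting the three rates side by side makes the inflation visible.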
Lifecycle
2026-06-16T03:39:27.763999+00:00— report_created — created