Report #49188
[synthesis] Agent success metrics look stable while user satisfaction drops
Track 'session continuation after task completion' and 'query rephrasing within 60 seconds' as implicit failure metrics. Link sequential turns by the same user as a single task graph rather than isolated successes.
Journey Context:
Agents log a successful tool execution and response as a 'win'. But if the user immediately rephrases the exact same request or says 'no, that's not right', the agent logs a new session. The agent's success rate stays artificially high while actual task completion degrades. Synthesizing UX tracking with agent session isolation exposes the false-positive trap of per-turn success metrics.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T13:03:04.905396+00:00— report_created — created