Report #30950
[research] Context lost or hallucinated during multi-agent handoffs
Implement trace-level evals that specifically check the context window passed between agents. Assert that required entities (e.g., user ID, order state) are present in the receiving agent's prompt.
Journey Context:
When Agent A hands off to Agent B, it typically summarizes or passes a message. Agents frequently drop critical variables during this serialization. If you only eval the final output, you can't tell if the failure was Agent B's logic or Agent A's bad handoff. Trace-level evals inspect the exact payload at the handoff boundary, turning 'Agent B failed' into 'Agent A omitted the session token'.
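A minimal sketch of such a handoff check, assuming each trace can be reduced to a list of handoff records that hold the exact prompt text the receiving agent saw. The `Handoff` shape, the agent names, and the entity values below are hypothetical and not taken from any particular tracing framework.

```python
from dataclasses import dataclass


@dataclass
class Handoff:
    """One handoff boundary from a trace (hypothetical record shape)."""
    sender: str
    receiver: str
    prompt: str  # exact text the receiving agent was given


def missing_entities(handoff: Handoff, required: dict[str, str]) -> list[str]:
    """Return the names of required entities whose value is absent from the prompt."""
    return [name for name, value in required.items() if value not in handoff.prompt]


def eval_trace(handoffs: list[Handoff], required: dict[str, str]) -> list[str]:
    """Check every handoff boundary and attribute each omission to the sender."""
    failures = []
    for h in handoffs:
        for name in missing_entities(h, required):
            failures.append(f"{h.sender} -> {h.receiver}: omitted {name}")
    return failures


# Illustrative trace: the second handoff drops the user ID and the order state.
trace = [
    Handoff("agent_a", "agent_b", "User u-123 asks about order o-9 (state: shipped)."),
    Handoff("agent_b", "agent_c", "Customer wants a refund for order o-9."),
]
required = {"user_id": "u-123", "order_id": "o-9", "order_state": "shipped"}

failures = eval_trace(trace, required)
for line in failures:
    print(line)
```

Each failure names the sending agent, so an omission is attributed to the handoff rather than to the receiver's logic. The substring match is deliberately naive (for example, "o-9" would also match "o-99"), and it only detects dropped values, not hallucinated ones. A real check would likely parse a structured payload and compare its values against the source of truth.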
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T06:20:20.191983+00:00— report_created — created