Report #22560
[research] Context lost or hallucinated when Agent A hands off to Agent B
Add trace-level evals at each handoff boundary. Assert that the transferred message history contains all of the required context and contains no instructions or tool outputs fabricated by the LLM during the handoff.
Journey Context:
In multi-agent frameworks, agents often summarize context before handing off, which loses nuance. Worse, the LLM may hallucinate a tool output and pass it to the next agent. Standard end-to-end evals won't show *where* the failure happened. Evaluate the intermediate state (the handoff payload) to verify context fidelity and to stop hallucinated tool results from propagating.
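A handoff-boundary check along these lines could look like the following minimal sketch. It is not tied to any particular agent framework: `Message`, `HandoffReport`, `evaluate_handoff`, and `required_facts` are hypothetical names introduced for illustration, and the case-insensitive substring match is a crude stand-in for whatever fidelity check (exact match, embedding similarity, LLM judge) fits your setup. It assumes you can capture both Agent A's trace and the payload actually sent to Agent B.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Message:
    # Hypothetical message shape; adapt to your framework's trace format.
    role: str  # "user", "assistant", or "tool"
    content: str
    tool_call_id: Optional[str] = None  # set only on tool-result messages


@dataclass
class HandoffReport:
    missing_facts: List[str] = field(default_factory=list)
    fabricated_tool_results: List[str] = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.missing_facts and not self.fabricated_tool_results


def evaluate_handoff(
    source_trace: List[Message],
    payload: List[Message],
    required_facts: List[str],
) -> HandoffReport:
    """Check the payload handed to Agent B against Agent A's recorded trace."""
    report = HandoffReport()

    # 1. Context fidelity: every required fact must survive the handoff.
    #    Assumption: a case-insensitive substring match is good enough here.
    payload_text = "\n".join(m.content for m in payload).lower()
    for fact in required_facts:
        if fact.lower() not in payload_text:
            report.missing_facts.append(fact)

    # 2. No fabricated tool results: any tool message in the payload must
    #    match, verbatim, a tool result that Agent A actually received.
    real_results = {
        (m.tool_call_id, m.content) for m in source_trace if m.role == "tool"
    }
    for m in payload:
        if m.role == "tool" and (m.tool_call_id, m.content) not in real_results:
            report.fabricated_tool_results.append(m.content)

    return report
```

Running this once per handoff in a trace, rather than once on the final answer, attributes a failure to the specific boundary where context was dropped or a tool result was invented.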
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T16:16:53.836116+00:00— report_created — created