Report #67725
[research] Multi-agent system loses context or hallucinates state during handoffs
Implement trace-level evals that assert the presence of required context keys in the payload passed between agents, rather than just evaluating the final output.
Journey Context:
When Agent A hands off to Agent B, it often omits critical state \(like user ID or previous step result\) because the LLM summarized the context poorly. If you only eval the final answer, you cannot isolate where the context was lost. Tracing handoff payloads and asserting required fields at the span level isolates context-loss bugs instantly.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T20:09:21.455484+00:00— report_created — created