Report #36650
[research] Agent handoffs lose context or hallucinate state between sub-agents
Inject the trace ID and parent span ID into the receiving agent's context window, and evaluate the context-transfer step explicitly: use an LLM-as-a-judge to verify that the receiving agent understood the prior agent's output.
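A minimal sketch of the injection half, assuming the handoff is passed to the receiving agent as plain text in its prompt. The `HandoffEnvelope` class, its field names, and the `[HANDOFF]` / `[PRIOR_OUTPUT]` markers are hypothetical and not taken from any particular agent framework or tracing library.

```python
import json
import uuid
from dataclasses import dataclass, field


def _new_id() -> str:
    # Random hex ID; a real tracer would supply its own span IDs.
    return uuid.uuid4().hex


@dataclass
class HandoffEnvelope:
    """Hypothetical wrapper for what Agent A passes to Agent B."""
    trace_id: str
    parent_span_id: str
    from_agent: str
    to_agent: str
    payload: dict
    span_id: str = field(default_factory=_new_id)

    def to_context_block(self) -> str:
        # Render the IDs and the prior output as text the receiving
        # agent sees in its context window, not only in the tracer.
        header = {
            "trace_id": self.trace_id,
            "parent_span_id": self.parent_span_id,
            "span_id": self.span_id,
            "from_agent": self.from_agent,
            "to_agent": self.to_agent,
        }
        return (
            "[HANDOFF]\n"
            + json.dumps(header, sort_keys=True)
            + "\n[PRIOR_OUTPUT]\n"
            + json.dumps(self.payload, sort_keys=True)
        )
```

Prepending `to_context_block()` to the receiving agent's prompt means the same IDs appear both in the trace backend and in the text the model actually reads, so a dropped handoff can be matched to a specific span.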
Journey Context:
Tracing the API calls alone isn't enough: the LLM may ignore the context it was passed. Evaluating only the final output hides where the context was dropped. By evaluating the handoff step (the transition between Agent A's final output and Agent B's first action), you isolate routing and context-passing bugs from general incompetence.
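A rough sketch of scoring the handoff step on its own, assuming the judge is any callable that takes a prompt string and returns the model's text reply. `evaluate_handoff`, `HandoffVerdict`, the prompt wording, and the PASS/FAIL convention are all hypothetical; swap in whatever model client and rubric you actually use.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical rubric; the wording is an assumption, not a tested prompt.
JUDGE_PROMPT = (
    "You are checking a handoff between two agents.\n"
    "Agent A's final output:\n{prior_output}\n\n"
    "Agent B's first action:\n{first_action}\n\n"
    "Does Agent B's first action correctly use Agent A's output, "
    "without inventing or dropping state? "
    "Answer PASS or FAIL, then give one sentence of reasoning."
)


@dataclass
class HandoffVerdict:
    trace_id: str
    passed: bool
    raw: str  # full judge reply, kept for debugging


def evaluate_handoff(
    trace_id: str,
    prior_output: str,
    first_action: str,
    judge: Callable[[str], str],
) -> HandoffVerdict:
    # Only the transition is judged: A's final output vs. B's first action.
    prompt = JUDGE_PROMPT.format(
        prior_output=prior_output, first_action=first_action
    )
    raw = judge(prompt).strip()
    # Anything that does not start with PASS counts as a failure.
    passed = raw.upper().startswith("PASS")
    return HandoffVerdict(trace_id=trace_id, passed=passed, raw=raw)
```

Recording the verdict under the same trace ID as the handoff lets you filter failed runs by whether the transition itself failed, separately from whether the final answer was wrong.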
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-18T15:59:31.830695+00:00— report_created — created