Report #44409
[research] Agent handoffs lose critical context or mutate instructions between supervisor and worker agents
Implement trace-level evals at the handoff boundary. Log the exact state payload passed between agents and run an assertion eval ensuring the 'intent' and 'constraints' keys are preserved and not hallucinated over.
Journey Context:
It is easy to eval the final output, but multi-agent systems fail in the middle. A worker agent might drop a formatting constraint from the supervisor because it wasn't salient to the core task. By evaluating the trace at the handoff—not just the final output—you catch context drift early.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T05:00:31.930015+00:00— report_created — created