Report #12802
[research] Context loss or mutation during multi-agent handoffs
Implement trace-level evals that assert the delegator's output schema matches the input schema the receiver expects, and log the exact token count and payload size at each handoff boundary.
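A minimal sketch of such a boundary check, under stated assumptions: `HandoffContract`, `check_handoff`, and `estimate_tokens` are hypothetical names invented for illustration, the payload is assumed to be a JSON-serializable dict, and the token estimate is a crude 4-characters-per-token heuristic that should be replaced with the tokenizer of the model actually in use.

```python
import json
import logging
from dataclasses import dataclass

logger = logging.getLogger("handoff")


@dataclass
class HandoffContract:
    """Hypothetical contract: what the receiving agent expects as input."""
    receiver: str
    required: dict[str, type]  # field name -> expected Python type
    max_tokens: int            # budget the receiver can accept


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Use the real tokenizer
    # if an exact count is needed.
    return (len(text) + 3) // 4


def check_handoff(delegator: str, payload: dict, contract: HandoffContract) -> list[str]:
    """Return the contract violations for one handoff (empty list means pass)."""
    violations = []
    for name, expected in contract.required.items():
        if name not in payload:
            violations.append(f"missing field: {name}")
        elif not isinstance(payload[name], expected):
            violations.append(
                f"wrong type for {name}: expected {expected.__name__}, "
                f"got {type(payload[name]).__name__}"
            )

    # Measure the payload exactly as it would be serialized across the boundary.
    serialized = json.dumps(payload, sort_keys=True)
    size_bytes = len(serialized.encode("utf-8"))
    tokens = estimate_tokens(serialized)
    if tokens > contract.max_tokens:
        violations.append(f"payload too large: ~{tokens} tokens > {contract.max_tokens}")

    logger.info(
        "handoff %s -> %s bytes=%d est_tokens=%d violations=%d",
        delegator, contract.receiver, size_bytes, tokens, len(violations),
    )
    return violations
```

Called once per delegation, for example `check_handoff("planner", payload, contract)`, this returns a list that an eval can assert is empty, while the log line records the payload size at that boundary.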
Journey Context:
In multi-agent systems, agents that pass tasks to one another often drop critical context or silently exceed the receiver's context window, and the receiving agent then fills the gap with hallucinated content. Checking only the final output does not show where the context was lost. Trace-level evals at each handoff boundary isolate the failure to a specific violation of the delegation contract.
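To illustrate how a boundary-level check localizes the failure, here is a rough sketch that walks a recorded trace and reports the first handoff where a critical field was dropped or changed. The trace format (a list of dicts with `from`, `to`, and `payload` keys), the function name `first_context_loss`, and the rule that critical fields must pass through unchanged are all assumptions made for this example, not features of any particular tracing tool.

```python
from typing import Any, Optional


def first_context_loss(
    trace: list[dict[str, Any]], critical_keys: list[str]
) -> Optional[dict[str, Any]]:
    """Find the first handoff where a critical field is missing or altered.

    Assumes each trace entry looks like:
        {"from": "agent_a", "to": "agent_b", "payload": {...}}
    and that critical fields should equal their value at the first handoff.
    """
    if not trace:
        return None

    # Values at the first handoff serve as the reference for mutation checks.
    baseline = {key: trace[0]["payload"].get(key) for key in critical_keys}

    for index, hop in enumerate(trace):
        payload = hop["payload"]
        for key in critical_keys:
            if key not in payload:
                reason = "dropped"
            elif payload[key] != baseline[key]:
                reason = "mutated"
            else:
                continue
            return {
                "hop": index,
                "from": hop["from"],
                "to": hop["to"],
                "key": key,
                "reason": reason,
            }
    return None


trace = [
    {"from": "planner", "to": "researcher", "payload": {"goal": "summarize Q3", "deadline": "Fri"}},
    {"from": "researcher", "to": "writer", "payload": {"goal": "summarize Q3"}},
    {"from": "writer", "to": "reviewer", "payload": {"goal": "summarize Q3"}},
]
result = first_context_loss(trace, ["goal", "deadline"])
```

In this made-up trace the result points at hop 1 (researcher to writer) with `deadline` dropped, which a check on the final output alone would not reveal.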
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T17:07:00.150622+00:00— report_created — created