Report #5498
[research] Agent handoffs drop context or tasks between steps
Implement trace-level evals by attaching a task_completion_checklist to the trace context. At each handoff, run an automated assertion verifying the checklist state matches the accumulated span data.
Journey Context:
Developers often rely on the LLM's context window to implicitly track state across handoffs. This fails silently when the context grows large or the instructions are complex. By externalizing the task state into trace metadata and asserting it at span boundaries, you get deterministic verification of non-deterministic handoffs.
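A minimal sketch of the handoff assertion described above, in plain Python with no tracing library. Everything except the task_completion_checklist name is an assumption made for illustration: the Span and TraceContext classes, the assert_handoff function, and the idea that each span records the task ids it finished. In a real setup the checklist and span data would live in whatever trace metadata your tracing backend exposes.

```python
from dataclasses import dataclass, field


@dataclass
class Span:
    """One agent step. `completed` lists the task ids this step finished (hypothetical shape)."""
    agent: str
    completed: list[str] = field(default_factory=list)


@dataclass
class TraceContext:
    """Trace metadata carried across handoffs (hypothetical shape)."""
    # Task id -> whether the handing-off agent claims the task is done.
    task_completion_checklist: dict[str, bool]
    # Span data accumulated so far in this trace.
    spans: list[Span] = field(default_factory=list)


class HandoffAssertionError(AssertionError):
    """Raised when the checklist disagrees with the accumulated span data."""


def assert_handoff(ctx: TraceContext) -> None:
    """Check, at a span boundary, that checklist state matches span evidence."""
    # Tasks that some recorded span actually reports as finished.
    evidenced = {task for span in ctx.spans for task in span.completed}
    # Tasks the checklist marks as done.
    claimed = {task for task, done in ctx.task_completion_checklist.items() if done}

    unsupported = sorted(claimed - evidenced)   # marked done, no span backs it up
    unrecorded = sorted(evidenced - claimed)    # done in a span, dropped from checklist

    if unsupported or unrecorded:
        raise HandoffAssertionError(
            f"checklist/span mismatch: unsupported={unsupported}, unrecorded={unrecorded}"
        )
```

Calling assert_handoff(ctx) before control passes to the next agent turns a silently dropped task into an immediate, deterministic failure. The check compares two explicit sets, so it does not depend on what the model remembers from its context window.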
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:32:56.611910+00:00 - report_created - created