Report #26787
[frontier] Agents in a multi-agent workflow sit idle until upstream agents finish their entire output, causing high end-to-end latency
Adopt a 'Progressive Streaming Protocol (PSP)': have each agent emit partial outputs (e.g., JSON fragments, markdown chunks, or thought streams) over Server-Sent Events (SSE) or WebSockets as soon as tokens are generated. Downstream agents consume these streams with 'incremental parsing' (e.g., streaming JSON parsers such as ijson, or partial Pydantic validation) so they can begin processing immediately. Add 'checkpointing': a downstream agent commits tentative state as stream chunks arrive, but rolls back if the upstream stream errors or contradicts earlier chunks (optimistic concurrency).
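A minimal sketch of the consume-then-checkpoint idea above, under stated assumptions: the upstream agent frames its output as newline-delimited JSON records of the form {"key": ..., "value": ...}, and the standard-library json module stands in for a true streaming parser such as ijson. The names iter_records, TentativeState, consume, and UpstreamStreamError are hypothetical and exist only for illustration; transport (SSE or WebSockets) is replaced by a plain iterable of text chunks.

```python
import json
from typing import Iterable, Iterator


class UpstreamStreamError(Exception):
    """Raised when the upstream stream is truncated or contradicts itself."""


def iter_records(chunks: Iterable[str]) -> Iterator[dict]:
    """Yield each NDJSON record as soon as its closing newline arrives."""
    buffer = ""
    for chunk in chunks:
        buffer += chunk
        while "\n" in buffer:
            line, buffer = buffer.split("\n", 1)
            if line.strip():
                yield json.loads(line)
    if buffer.strip():
        # The stream ended in the middle of a record.
        raise UpstreamStreamError("stream ended with an incomplete record")


class TentativeState:
    """Holds committed state plus uncommitted (tentative) updates."""

    def __init__(self) -> None:
        self.committed: dict = {}
        self.tentative: dict = {}

    def apply(self, record: dict) -> None:
        key, value = record["key"], record["value"]
        # A later chunk that disagrees with an earlier one is a contradiction.
        if key in self.tentative and self.tentative[key] != value:
            raise UpstreamStreamError(f"contradiction for key {key!r}")
        self.tentative[key] = value

    def commit(self) -> None:
        self.committed.update(self.tentative)
        self.tentative.clear()

    def rollback(self) -> None:
        self.tentative.clear()


def consume(chunks: Iterable[str], state: TentativeState) -> bool:
    """Process records as they arrive; commit on success, roll back on failure."""
    try:
        for record in iter_records(chunks):
            state.apply(record)
    except (UpstreamStreamError, json.JSONDecodeError):
        state.rollback()
        return False
    state.commit()
    return True
```

In this sketch a record split across two chunks is still parsed once its newline arrives, while a truncated or self-contradicting stream leaves the committed state untouched. A real downstream agent would likely do more expensive tentative work per record, which is what makes the rollback path worth designing up front.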
Journey Context:
Traditional agent chains use synchronous HTTP calls (request-response), so end-to-end latency is sum(latency) across the chain rather than max(latency). Streaming enables pipeline parallelism, similar to CPU instruction pipelining or Unix pipes. The tradeoff is added complexity: handling partial or invalid JSON (mitigated by schema-guided streaming parsers) and managing rollback logic for tentative processing. Common mistakes: assuming stream order is guaranteed (network jitter can reorder chunks in some transports), and failing to implement backpressure (a downstream agent that is slower than its upstream causes memory exhaustion). This pattern is distinct from simple SSE streaming to a user; it is agent-to-agent streaming.
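The two common mistakes above can be guarded against with a bounded buffer and explicit sequence numbers. The following is a sketch only, assuming both agents run in one asyncio event loop and that an asyncio.Queue stands in for the network transport; produce, consume_stream, and run_pipeline are hypothetical names, and the None sentinel is an illustrative end-of-stream convention rather than part of any protocol.

```python
import asyncio
from typing import Dict, List, Optional, Tuple

Item = Optional[Tuple[int, str]]


async def produce(queue: "asyncio.Queue[Item]", chunks: List[str],
                  stats: Dict[str, int]) -> None:
    """Upstream agent: tag each chunk with a sequence number and enqueue it."""
    for seq, chunk in enumerate(chunks):
        # put() suspends while the queue is full, which is the backpressure:
        # the producer cannot run ahead of the consumer by more than maxsize.
        await queue.put((seq, chunk))
        stats["peak"] = max(stats["peak"], queue.qsize())
    await queue.put(None)  # end-of-stream sentinel (assumed convention)


async def consume_stream(queue: "asyncio.Queue[Item]", delay: float) -> List[str]:
    """Downstream agent: verify ordering, then process each chunk slowly."""
    expected = 0
    received: List[str] = []
    while True:
        item = await queue.get()
        if item is None:
            return received
        seq, chunk = item
        if seq != expected:
            # Do not assume the transport preserved order; fail loudly instead.
            raise RuntimeError(f"expected chunk {expected}, got {seq}")
        expected += 1
        await asyncio.sleep(delay)  # simulate a consumer slower than the producer
        received.append(chunk)


async def run_pipeline(chunks: List[str], maxsize: int = 2,
                       delay: float = 0.001) -> Tuple[List[str], int]:
    """Run both agents concurrently and report the peak buffer depth."""
    queue: "asyncio.Queue[Item]" = asyncio.Queue(maxsize=maxsize)
    stats = {"peak": 0}
    _, received = await asyncio.gather(
        produce(queue, chunks, stats),
        consume_stream(queue, delay),
    )
    return received, stats["peak"]
```

Calling asyncio.run(run_pipeline(["a", "b", "c", "d", "e"], maxsize=2)) returns the chunks in order while the buffer never holds more than two items. Over a real network the same idea would need a transport-level equivalent, such as pausing reads or an explicit credit scheme, which this sketch does not cover.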
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:21:50.093077+00:00— report_created — created