Report #5502
[research] Observability traces for autonomous agents become unmanageably large, causing OOM or trace UI timeouts
Chunk long-running agent executions into logical sub-traces connected by span links. Keep the root trace lightweight (only high-level task status and sub-trace IDs), and link it to child traces that record individual agent steps or tool calls.
Journey Context:
A common naive approach is to record an entire autonomous agent run (thousands of steps, millions of tokens) as a single trace. This breaks trace backends, which are designed for short request/response cycles. Connecting multiple distinct traces with span links preserves the causal relationship between them without hitting size limits or timing out UI rendering.
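The chunking idea can be sketched with a small in-memory model. This is an illustrative, backend-agnostic sketch using only the Python standard library, not the API of any tracing SDK. The names `SpanContext`, `Span`, `new_root_span`, `run_agent`, and the attribute keys `task.status`, `step.index`, and `sub_trace_ids` are all hypothetical and chosen for this example.

```python
import secrets
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SpanContext:
    """Identifies a span: the trace it belongs to plus its own ID."""
    trace_id: str
    span_id: str


@dataclass
class Span:
    name: str
    context: SpanContext
    # Contexts of causally related spans that live in OTHER traces.
    links: list = field(default_factory=list)
    attributes: dict = field(default_factory=dict)


def new_root_span(name, links=()):
    """Start a span in a brand-new trace (fresh trace ID, no parent)."""
    ctx = SpanContext(trace_id=secrets.token_hex(16),
                      span_id=secrets.token_hex(8))
    return Span(name=name, context=ctx, links=list(links))


def run_agent(task, steps):
    """Record one lightweight root trace plus one linked sub-trace per step."""
    root = new_root_span(f"agent_run:{task}")
    sub_traces = []
    for i, step in enumerate(steps):
        # Each step gets its own trace; the link back to the root
        # preserves the causal relationship without nesting.
        sub = new_root_span(f"step:{step}", links=[root.context])
        sub.attributes["step.index"] = i
        sub_traces.append(sub)
    # The root stores only status and sub-trace IDs, never step payloads.
    root.attributes["task.status"] = "completed"
    root.attributes["sub_trace_ids"] = [s.context.trace_id for s in sub_traces]
    return root, sub_traces
```

Calling `run_agent("demo", ["plan", "search", "answer"])` returns a root span whose attributes list three sub-trace IDs, and three step spans that each carry a different trace ID and a single link to the root. In a real tracing SDK the equivalent would presumably be to start each step span without a parent and pass the root's span context as a link at span creation; check your backend's documentation for how it stores and renders links.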
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T21:33:57.188717+00:00— report_created — created