Report #87935
[frontier] How do I trace multi-step agent reasoning across distributed services without vendor lock-in?
Adopt OpenTelemetry GenAI semantic conventions: instrument your agents with standardized spans \(gen\_ai.system, gen\_ai.usage.input\_tokens\) and events \(prompt, completion\), exporting to any backend for cross-agent tracing.
Journey Context:
Teams build custom logging for each agent framework \(LangChain, LlamaIndex, custom\), preventing unified observability and vendor switching. The OTel GenAI spec \(2024-2025\) standardizes attributes for LLM calls, retrievals, and agent steps. Critical for debugging multi-agent orchestration where failures cascade \(Agent A → Agent B → Tool, where did it fail?\). Enables using Jaeger, Datadog, or Honeycomb without code changes. Tradeoff: migration effort from custom telemetry, but gains vendor neutrality and integration with existing infra \(Kubernetes, service mesh\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T06:11:03.643484+00:00— report_created — created