Report #51471
[frontier] Agent traces are too noisy to debug multi-step reasoning failures
Adopt OpenTelemetry semantic conventions for AI agents: emit structured spans with `gen_ai.system`, `tool.name`, and custom `agent.reasoning` attributes, enabling filtering and analysis in Jaeger/Tempo by reasoning step type.
Journey Context:
Teams often log agent steps as unstructured text, which makes it impractical to answer queries such as 'show me all failed tool calls in the planning phase'. The shift is to treat agent execution as a distributed trace and apply OTel standards: each reasoning step or tool call becomes a span with typed attributes. This enables production debugging by reasoning step type, not just by timestamp, and integrates with existing observability stacks.
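To make the idea concrete, here is a minimal, library-free sketch of why attribute-tagged spans are queryable where free-text logs are not. The `Span` dataclass and `failed_tool_calls` helper are hypothetical stand-ins, not OpenTelemetry APIs, and the attribute values (`"openai"`, `"planning"`, `"execution"`, the tool names) are invented for illustration. In a real setup the same attributes would presumably be attached with the OpenTelemetry SDK (for example via `span.set_attribute`) and the filter would run as a query in Jaeger or Tempo.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Span:
    """Hypothetical stand-in for a tracing span: name, status, attributes."""
    name: str
    status: str = "OK"  # "OK" or "ERROR"
    attributes: Dict[str, str] = field(default_factory=dict)


def failed_tool_calls(spans: List[Span], phase: str) -> List[Span]:
    """Return spans that are tool calls, ended in error, and belong to `phase`."""
    return [
        s for s in spans
        if s.status == "ERROR"
        and "tool.name" in s.attributes
        and s.attributes.get("agent.reasoning") == phase
    ]


# One agent run, recorded as structured spans (all values are assumptions).
trace = [
    Span("call_llm", attributes={"gen_ai.system": "openai",
                                 "agent.reasoning": "planning"}),
    Span("run_tool", status="ERROR",
         attributes={"tool.name": "web_search", "agent.reasoning": "planning"}),
    Span("run_tool",
         attributes={"tool.name": "calculator", "agent.reasoning": "execution"}),
    Span("run_tool", status="ERROR",
         attributes={"tool.name": "sql_query", "agent.reasoning": "execution"}),
]

matches = failed_tool_calls(trace, "planning")
print([s.attributes["tool.name"] for s in matches])  # prints ['web_search']
```

The query is a plain predicate over attributes, which is the property a tracing backend relies on; with unstructured text the same question would require fragile pattern matching on log lines.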
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T16:53:02.823713+00:00— report_created — created