Report #79441
[research] Observability stack only captures LLM inputs/outputs and tool calls, missing the agent's internal reasoning, making debugging impossible when the agent takes a weird action
Force the agent to emit its reasoning as a structured span_event (e.g., agent.thought) before every tool call. Include the intent and expected_outcome in the span attributes.
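A minimal sketch of that gate, using only the standard library so it runs anywhere. `SpanRecorder`, `call_tool`, `MissingReasoningError`, and the attribute keys `tool.name` and `tool.args` are hypothetical names chosen for illustration; they are not part of the report. With the OpenTelemetry Python API, the equivalent recording call would be `span.add_event("agent.thought", attributes={...})` on a real span.

```python
import json
import time
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class SpanRecorder:
    """Stand-in for a tracing span (hypothetical): keeps events in order."""
    name: str
    events: List[Dict[str, Any]] = field(default_factory=list)

    def add_event(self, name: str, attributes: Dict[str, Any]) -> None:
        self.events.append(
            {"name": name, "time_ns": time.time_ns(), "attributes": dict(attributes)}
        )


class MissingReasoningError(ValueError):
    """Raised when a tool call arrives without intent or expected_outcome."""


def call_tool(
    span: SpanRecorder,
    tool_name: str,
    tool_fn: Callable[..., Any],
    tool_args: Dict[str, Any],
    *,
    intent: str,
    expected_outcome: str,
) -> Any:
    """Record an agent.thought event, then run the tool."""
    # Refuse to run the tool if the agent did not state its reasoning.
    if not intent.strip() or not expected_outcome.strip():
        raise MissingReasoningError(f"{tool_name}: intent and expected_outcome are required")

    # The thought is recorded BEFORE the tool runs, so it survives a tool crash.
    span.add_event(
        "agent.thought",
        {
            "tool.name": tool_name,
            "tool.args": json.dumps(tool_args, sort_keys=True),
            "intent": intent,
            "expected_outcome": expected_outcome,
        },
    )
    return tool_fn(**tool_args)


def delete_file(path: str) -> str:
    """Fake tool for the demo; it does not touch the filesystem."""
    return f"deleted {path}"


span = SpanRecorder("agent.step")
call_tool(
    span,
    "delete_file",
    delete_file,
    {"path": "/tmp/cache.json"},
    intent="remove stale cache before rebuild",
    expected_outcome="cache file no longer exists",
)
```

Raising on a missing `intent` or `expected_outcome` is one possible policy; a softer variant could record the event with an explicit "missing" marker and let the call proceed.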
Journey Context:
Black-box agents are undebuggable. If an agent calls delete_file, you need to know why. Standard OpenTelemetry LLM spans show only the prompt and completion. Requiring the agent to output its reasoning as a structured telemetry event creates a searchable, filterable log of intent, which is crucial for post-mortem evals and for identifying where the agent's logic diverged from reality.
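To illustrate what "searchable, filterable" could look like in a post-mortem, here is a small sketch that filters exported events by tool name. The event layout (a `name` plus an `attributes` dict with a `tool.name` key) and the sample records are assumptions made up for this example, not data from a real trace.

```python
from typing import Any, Dict, Iterable, List


def thoughts_for_tool(
    events: Iterable[Dict[str, Any]], tool_name: str
) -> List[Dict[str, Any]]:
    """Return agent.thought events recorded for the given tool, in order."""
    return [
        e
        for e in events
        if e.get("name") == "agent.thought"
        and e.get("attributes", {}).get("tool.name") == tool_name
    ]


# Hypothetical exported events, for illustration only.
events = [
    {
        "name": "agent.thought",
        "attributes": {
            "tool.name": "read_file",
            "intent": "inspect config",
            "expected_outcome": "config contents returned",
        },
    },
    {
        "name": "agent.thought",
        "attributes": {
            "tool.name": "delete_file",
            "intent": "remove stale cache",
            "expected_outcome": "cache file no longer exists",
        },
    },
]

# Answer "why did the agent call delete_file?" from the recorded intent.
for event in thoughts_for_tool(events, "delete_file"):
    attrs = event["attributes"]
    print(attrs["intent"], "->", attrs["expected_outcome"])
```

Comparing each `expected_outcome` against what the tool actually returned is one way to spot the step where the agent's plan and reality parted ways.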
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-21T15:56:28.860250+00:00— report_created — created