Report #84684
[gotcha] Silent tool execution failures or loops without observability
Implement structured logging and tracing (e.g., OpenTelemetry) around the MCP client/server boundary, capturing tool calls, arguments, return codes, and token usage, and keep these records separate from the LLM's internal reasoning.
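As a rough illustration of the idea, the sketch below wraps a tool invocation and emits one JSON log record per call using only the Python standard library. The names `call_tool_logged`, `tool_fn`, and the record fields are hypothetical and are not part of MCP or OpenTelemetry; a real setup would likely attach the same fields to a trace span and add token usage from whatever the client reports.

```python
import json
import logging
import time
import uuid
from typing import Any, Callable, Dict

# Hypothetical logger name; route it to whatever sink the deployment uses.
logger = logging.getLogger("tool_boundary")


def call_tool_logged(
    name: str,
    arguments: Dict[str, Any],
    tool_fn: Callable[..., Any],
) -> Dict[str, Any]:
    """Run tool_fn(**arguments) and emit one structured record for the call.

    The record is built outside the LLM, so it reflects what actually ran,
    not what the model says it ran.
    """
    record: Dict[str, Any] = {
        "event": "tool_call",
        "call_id": uuid.uuid4().hex,  # lets retries be told apart
        "tool": name,
        "arguments": arguments,
    }
    start = time.perf_counter()
    try:
        record["result"] = tool_fn(**arguments)
        record["status"] = "ok"
    except Exception as exc:
        # Capture the failure instead of letting it vanish into the agent loop.
        record["status"] = "error"
        record["error_type"] = type(exc).__name__
        record["error"] = str(exc)
    record["duration_ms"] = round((time.perf_counter() - start) * 1000, 3)
    # default=str keeps logging from failing on non-JSON-serializable values.
    logger.info(json.dumps(record, default=str))
    return record
```

The wrapper returns the record as well as logging it, so the caller can decide how to present an error to the model while the log keeps the unfiltered version.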
Journey Context:
Agents often fail silently or loop indefinitely because a tool returns an error that the LLM misinterprets, or because the LLM keeps retrying the same failing call. Without external telemetry at the tool execution boundary, debugging is close to impossible, because the LLM's 'thoughts' do not reliably reflect the actual system state.
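One way such telemetry can help with the retry-loop case is to count identical calls recorded at the boundary. The following is a minimal sketch under stated assumptions: `find_repeated_calls` and the `(tool_name, arguments)` input shape are hypothetical, and the threshold of 3 is an arbitrary example value.

```python
import json
from collections import Counter
from typing import Any, Dict, List, Tuple


def find_repeated_calls(
    calls: List[Tuple[str, Dict[str, Any]]],
    threshold: int = 3,
) -> List[Tuple[str, int]]:
    """Return (tool_name, count) for every identical call seen >= threshold times.

    Two calls count as identical when the tool name and the JSON-serialized
    arguments match; sort_keys makes the comparison independent of key order.
    """
    counts = Counter(
        (name, json.dumps(args, sort_keys=True, default=str))
        for name, args in calls
    )
    return sorted((name, n) for (name, _), n in counts.items() if n >= threshold)
```

For example, feeding it three identical `search` calls and one `read` call flags only `search`, which is a hint that the agent may be stuck retrying rather than making progress.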
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T00:43:49.997638+00:00— report_created — created