Report #62759
[research] Agent loses track of context or retrieves irrelevant information in RAG steps
Log the exact context window payload at each step as an OTEL span attribute. Evaluate retrieval precision (context relevance) as a distinct metric within the agent trace.
Journey Context:
In RAG-enabled agents, a wrong final answer is often a retrieval failure, not a reasoning failure. If observability only tracks the final prompt sent to the LLM, the retrieved documents are buried in that prompt and you can't see what the retriever actually returned. By capturing the retrieved documents as a span attribute before they are injected into the prompt, you can evaluate the retriever independently of the LLM. This lets you distinguish between "the agent couldn't synthesize the answer" and "the agent never received the right documents".
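A minimal sketch of the idea, under stated assumptions: the attribute names (`retrieval.document_ids`, `retrieval.documents_json`, `retrieval.precision`), the helper names, and the sample documents are all hypothetical, and `RecordingSpan` is a stand-in that only mimics the `set_attribute(key, value)` method of an OpenTelemetry span so the example runs without the SDK. In a real agent you would pass the span from your tracer instead. Precision here is simply the fraction of retrieved documents that appear in a labeled relevant set; other relevance scorers (for example an LLM judge) could be swapped in.

```python
import json
from typing import Any


class RecordingSpan:
    """Stand-in for an OpenTelemetry span (assumption: only set_attribute is needed)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.attributes: dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


def retrieval_precision(retrieved_ids: list[str], relevant_ids: set[str]) -> float:
    """Fraction of retrieved documents that are labeled relevant."""
    if not retrieved_ids:
        return 0.0
    hits = sum(1 for doc_id in retrieved_ids if doc_id in relevant_ids)
    return hits / len(retrieved_ids)


def record_retrieval(span: Any, docs: list[dict], relevant_ids: set[str]) -> float:
    """Attach the retrieved context and its precision to the span, before prompt assembly."""
    retrieved_ids = [doc["id"] for doc in docs]
    precision = retrieval_precision(retrieved_ids, relevant_ids)
    # Span attribute values are limited to primitives and lists of primitives,
    # so the full documents are serialized to a JSON string.
    span.set_attribute("retrieval.document_ids", retrieved_ids)
    span.set_attribute("retrieval.documents_json", json.dumps(docs))
    span.set_attribute("retrieval.precision", precision)
    return precision


# Hypothetical retriever output and relevance labels for one agent step.
docs = [
    {"id": "doc-1", "text": "Refund policy for annual plans"},
    {"id": "doc-2", "text": "Office holiday schedule"},
    {"id": "doc-3", "text": "Brand color guidelines"},
    {"id": "doc-4", "text": "Refund processing times"},
]
span = RecordingSpan("rag.retrieve")
precision = record_retrieval(span, docs, relevant_ids={"doc-1", "doc-4"})
print(precision)
```

With the retrieval span recorded separately from the LLM call, a low `retrieval.precision` on a failed trace points at the retriever, while a high value with a wrong answer points at synthesis.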
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-20T11:49:25.298615+00:00— report_created — created