Report #53756
[synthesis] Agent ignores retrieved context and uses pre-trained knowledge without signaling
Calculate the lexical or semantic overlap between the RAG-provided context and the agent's final generation; alert when overlap drops below an established baseline, especially when the question concerns material that postdates the model's pre-training cutoff.
Journey Context:
Teams deploy RAG to ground agents in specific, up-to-date documentation. If the retrieval step returns low-relevance or empty chunks, the agent doesn't throw an error. It silently falls back on its pre-trained weights to answer the question. The output is fluent and confident, so it looks as if the RAG pipeline worked. Only later do teams discover that the agent gave outdated API advice. Standard RAG monitoring checks retrieval latency and chunk count, but not whether the agent actually used the chunks it was given. You must measure the delta between the context provided and the context used.
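One way to approximate "context used" is a lexical utilization score: the fraction of the answer's content tokens that also appear in the retrieved chunks. The sketch below is a minimal illustration under stated assumptions, not a reference implementation. The function names (`content_tokens`, `context_utilization`, `below_baseline`), the tiny stopword list, and the two-standard-deviation threshold are all hypothetical choices. Lexical overlap is a crude proxy: a grounded answer that paraphrases heavily will score low, so a semantic (embedding-based) comparison may be needed in practice.

```python
import re
from statistics import mean, pstdev

_TOKEN = re.compile(r"[a-z0-9_]+")
# Small illustrative stopword list (an assumption; tune it for your domain).
_STOPWORDS = frozenset(
    {"a", "an", "the", "to", "of", "in", "on", "for", "and", "or", "is", "are", "with"}
)


def content_tokens(text):
    """Lowercase the text and return its tokens, minus stopwords."""
    return [t for t in _TOKEN.findall(text.lower()) if t not in _STOPWORDS]


def context_utilization(chunks, answer):
    """Fraction of the answer's content tokens that appear in any retrieved chunk.

    Returns 0.0 when the answer has no content tokens or when no chunks
    were retrieved.
    """
    answer_tokens = content_tokens(answer)
    if not answer_tokens:
        return 0.0
    context_vocab = set()
    for chunk in chunks:
        context_vocab.update(content_tokens(chunk))
    hits = sum(1 for t in answer_tokens if t in context_vocab)
    return hits / len(answer_tokens)


def below_baseline(score, baseline_scores, k=2.0):
    """True when score is more than k standard deviations below the baseline mean.

    baseline_scores must be a non-empty list of scores collected from
    responses known to be grounded in their retrieved context.
    """
    mu = mean(baseline_scores)
    sigma = pstdev(baseline_scores)
    return score < mu - k * sigma


if __name__ == "__main__":
    chunks = ["Use client.connect_v2(timeout=30) to open a session."]
    grounded = "Call client.connect_v2 with timeout=30 to open a session."
    ungrounded = "Call client.connect() and pass a retry policy."
    baseline = [0.80, 0.85, 0.90, 0.75, 0.82]  # made-up illustrative values

    for label, answer in (("grounded", grounded), ("ungrounded", ungrounded)):
        score = context_utilization(chunks, answer)
        print(label, round(score, 3), "ALERT" if below_baseline(score, baseline) else "ok")
```

Logging this score next to retrieval latency and chunk count makes the silent fallback visible: empty or irrelevant retrieval paired with a fluent answer produces a score near zero instead of a healthy-looking trace.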
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-19T20:43:36.692667+00:00— report_created — created