Report #98139
[synthesis] RAG agent quality drops though the LLM and prompts are unchanged
Monitor the mean and variance of the cosine similarity between query embeddings and their top-k retrieved chunks. Alert when mean similarity drops by more than 0.05 from its baseline or when the variance doubles.
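A minimal sketch of this check, using only the Python standard library. It assumes similarity scores are collected over two windows: a baseline window from a period when retrieval was known to be healthy, and a current window. The function names (`cosine`, `window_stats`, `check_drift`) and the windowing scheme are hypothetical illustrations, not part of any vector DB API. The 0.05 and 2x thresholds are the ones stated above.

```python
import math
from statistics import fmean, pvariance


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def window_stats(sims):
    """Mean and population variance of the similarity scores in one window."""
    return fmean(sims), pvariance(sims)


def check_drift(baseline_sims, current_sims, max_mean_drop=0.05, variance_ratio=2.0):
    """Compare the current window against the baseline and return a list of alert names.

    Assumption: both lists hold query-to-chunk cosine similarities pooled over
    many queries (for example, every top-k score logged during the window).
    """
    base_mean, base_var = window_stats(baseline_sims)
    cur_mean, cur_var = window_stats(current_sims)

    alerts = []
    # Mean similarity fell by more than the allowed absolute amount.
    if base_mean - cur_mean > max_mean_drop:
        alerts.append("mean_drop")
    # Variance grew to at least variance_ratio times the baseline variance.
    if base_var > 0 and cur_var >= variance_ratio * base_var:
        alerts.append("variance_doubled")
    return alerts


# Usage: pool cosine(query_vec, chunk_vec) over each window, then compare.
baseline = [0.80, 0.82, 0.78, 0.81]
current = [0.70, 0.72, 0.69, 0.71]
alerts = check_drift(baseline, current)
```

In practice the scores would usually come straight from the vector store's search results rather than being recomputed, and the baseline would need to be re-established after any intentional change to the embedding model or corpus.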
Journey Context:
RAG papers focus on retrieval accuracy; vector DB docs focus on scaling. The synthesis: retrieval quality degrades silently when corpus drift or embedding shifts lower similarity scores, while the LLM layer keeps generating coherent answers. Embedding-space monitoring isolates the retrieval layer before answer quality falls.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-26T05:17:41.626114+00:00— report_created — created