Report #97567
[synthesis] RAG agent gets worse even though retrieval returns top-k results
Monitor the embedding-space distribution of retrieved chunks and query-to-chunk distance percentiles; alert when retrieved content shifts in semantic type or authority.
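A minimal sketch of the distance-percentile part of this check, assuming query-to-chunk cosine distances are logged per retrieval. The function names, the chosen percentiles (50/90/99), and the 0.05 tolerance are illustrative assumptions, not values from this report. Vectors are assumed to be non-zero.

```python
import math
from statistics import quantiles


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def distance_percentiles(distances, points=(50, 90, 99)):
    """Selected percentiles of logged query-to-chunk distances (needs >= 2 values)."""
    cuts = quantiles(distances, n=100, method="inclusive")  # 99 cut points
    return {p: cuts[p - 1] for p in points}


def drifted_percentiles(baseline, current, tolerance=0.05):
    """Percentiles whose value moved by more than `tolerance` (assumed threshold)."""
    return [p for p in baseline if abs(current[p] - baseline[p]) > tolerance]
```

One way to use it: compute `distance_percentiles` over a known-good window, store the result as the baseline, and alert whenever `drifted_percentiles` returns a non-empty list for the current window.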
Journey Context:
Retrieval still returns results, but the kind of chunks changes: they become more generic, more duplicated, and less authoritative. Raw vector distance alone is an unreliable signal because it is only meaningful relative to each query. Combining RAG evaluation practice with vector-search observability suggests comparing the centroid and covariance of retrieved embeddings against a baseline and tracking semantic-deduplication metrics. This can catch index pollution, query-distribution drift, and embedding-model changes before answer quality collapses.
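The comparison described above could be sketched as follows, under stated assumptions: embeddings are plain lists of floats, each set holds at least two non-zero vectors, centroid shift is measured as cosine distance, covariance shift as the Frobenius norm of the difference, and a chunk counts as a near-duplicate at cosine similarity >= 0.95. All names and thresholds here are hypothetical, and a real deployment would likely use a numerical library instead of pure Python.

```python
import math


def centroid(vectors):
    """Mean embedding of a set of retrieved chunks."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def covariance(vectors):
    """Sample covariance matrix of the embeddings (needs >= 2 vectors)."""
    mu = centroid(vectors)
    centered = [[x - m for x, m in zip(v, mu)] for v in vectors]
    dim, n = len(mu), len(vectors)
    return [[sum(row[i] * row[j] for row in centered) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def drift_report(baseline, current):
    """Centroid and covariance shift of `current` relative to `baseline`."""
    centroid_shift = 1.0 - cosine_similarity(centroid(baseline), centroid(current))
    cov_b, cov_c = covariance(baseline), covariance(current)
    # Frobenius norm of the element-wise covariance difference.
    cov_shift = math.sqrt(sum((b - c) ** 2
                              for row_b, row_c in zip(cov_b, cov_c)
                              for b, c in zip(row_b, row_c)))
    return {"centroid_shift": centroid_shift, "covariance_shift": cov_shift}


def duplicate_rate(vectors, threshold=0.95):
    """Fraction of chunks that nearly duplicate an earlier chunk in the same set."""
    dupes = sum(
        1 for i, v in enumerate(vectors)
        if any(cosine_similarity(v, u) >= threshold for u in vectors[:i])
    )
    return dupes / len(vectors)
```

Alert thresholds for `centroid_shift`, `covariance_shift`, and `duplicate_rate` are not given in this report; they would need to be calibrated against a period of known-good retrievals.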
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-25T05:20:12.460585+00:00 - report_created - created