Report #26926
[architecture] Poor recall (missing relevant results) or excessive RAM usage with vector similarity search (pgvector/Pinecone/FAISS)
Use an HNSW (Hierarchical Navigable Small World) index for dynamic, high-recall workloads where a 2-5x RAM overhead is acceptable; use IVFFlat only for static datasets under severe memory constraints; monitor recall@k, not just latency; avoid exact (brute-force) k-NN search beyond roughly 100k vectors
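For pgvector specifically, the two index types above differ mainly in their build options and in the query-time knob that trades recall for speed. The following is a minimal sketch that only builds the SQL strings; the helper name `pgvector_index_sql`, the table `items`, and the column `embedding` are hypothetical, and the parameter values are illustrative starting points rather than recommendations.

```python
def pgvector_index_sql(table, column, kind, opclass="vector_cosine_ops"):
    """Return (create_index_sql, session_tuning_sql) for a pgvector ANN index.

    Hypothetical helper for illustration. Identifiers are interpolated
    directly, so pass only trusted table and column names.
    """
    if kind == "hnsw":
        # m: links per node; ef_construction: build-time candidate list size.
        create = (
            f"CREATE INDEX ON {table} USING hnsw ({column} {opclass}) "
            "WITH (m = 16, ef_construction = 64);"
        )
        # Larger ef_search raises recall at the cost of query latency.
        tune = "SET hnsw.ef_search = 100;"
    elif kind == "ivfflat":
        # lists: number of clusters; build after the table holds data,
        # because the clusters are learned from the existing rows.
        create = (
            f"CREATE INDEX ON {table} USING ivfflat ({column} {opclass}) "
            "WITH (lists = 100);"
        )
        # More probes means more clusters scanned per query: higher recall, slower.
        tune = "SET ivfflat.probes = 10;"
    else:
        raise ValueError(f"unknown index kind: {kind!r}")
    return create, tune


create_sql, tune_sql = pgvector_index_sql("items", "embedding", "hnsw")
print(create_sql)
print(tune_sql)
```

The operator class should match the distance operator used in queries (for example `vector_cosine_ops` with `<=>`), otherwise the planner will not use the index.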
Journey Context:
Teams often start with exact nearest-neighbor (k-NN) search, which is O(N) per query and becomes too slow at scale (beyond roughly 100k vectors). When they switch to approximate nearest neighbor (ANN) search, they tend to pick IVFFlat (inverted file) as the apparent default, but it has to be "trained" (clustered) on data that already exists and often delivers lower recall (around 90%) unless more lists are probed per query. If the data changes frequently, the clusters drift away from the data and IVFFlat needs expensive index rebuilds. HNSW graphs offer recall above 95%, absorb inserts incrementally without rebuilds, and provide better concurrency, but they consume significantly more RAM because the graph structure is stored alongside the vectors. The hard-won insight: for production AI apps with frequent updates (RAG over changing documents), HNSW is generally the practical choice despite its memory cost; IVFFlat suits batch analytics on frozen embeddings. Always measure `recall@10` against ground truth, not just query speed.
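Measuring `recall@10` means comparing what the ANN index returns with the exact top 10 computed by brute force on a sample of queries. A minimal sketch in plain Python, assuming Euclidean distance and vectors small enough to scan exhaustively; `ann_search` is a hypothetical callable standing in for whatever index is under test, and all function names are illustrative.

```python
def exact_top_k(query, vectors, k):
    """Brute-force top-k ids by squared Euclidean distance (the ground truth)."""
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(query, v))
    ranked = sorted(range(len(vectors)), key=lambda i: dist(vectors[i]))
    return ranked[:k]


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k neighbours that the ANN result contains."""
    truth = set(exact_ids[:k])
    return len(truth & set(approx_ids[:k])) / len(truth)


def mean_recall_at_k(queries, vectors, ann_search, k=10):
    """Average recall@k over a sample of queries.

    ann_search(query, k) is a hypothetical wrapper around the real index
    and must return a list of vector ids.
    """
    scores = [
        recall_at_k(ann_search(q, k), exact_top_k(q, vectors, k), k)
        for q in queries
    ]
    return sum(scores) / len(scores)


# Toy usage: an "index" that misses one of the two true neighbours.
vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
queries = [[0.1, 0.0]]
print(mean_recall_at_k(queries, vectors, lambda q, k: [3, 0], k=2))
```

Running the ground-truth pass on a few hundred sampled queries is usually affordable even when exact search is too slow for production traffic, and it lets recall be tracked after every index parameter change.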
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-17T23:35:32.531095+00:00— report_created — created