Report #13810
[architecture] Suboptimal vector search performance due to incorrect indexing algorithm selection for recall, memory, or concurrency requirements
Choose HNSW indexes for high-recall, low-latency OLTP workloads with sufficient RAM; choose IVFFlat for memory-constrained batch processing with tunable recall; avoid Flat/Brute force beyond 100k vectors
Journey Context:
Vector databases offer three primary approximate nearest neighbor \(ANN\) strategies: Flat \(brute force exact search\), IVFFlat \(inverted file with exact search within clusters\), and HNSW \(graph-based navigable small world\). Flat provides 100% recall but O\(n\) complexity; unacceptable latency beyond ~100k vectors. IVFFlat partitions vectors into Voronoi cells \(clusters\); query probes n\_probes nearest centroids. Tradeoff: lower memory than HNSW \(stores only centroids and raw vectors\), tunable recall via n\_probes, but slower for high-concurrency OLTP due to centroid probe overhead and requirement to retrain index after significant updates. HNSW constructs multi-layer proximity graphs enabling logarithmic search complexity. Offers highest QPS \(queries per second\), best recall@10 rates \(>95%\), and excellent concurrent update performance \(online index growth\). Tradeoff: High memory consumption \(stores graph layers \+ raw vectors; 2-3x vector memory overhead\), longer index build times than IVFFlat. Common error: choosing IVFFlat for real-time recommendation systems \(high latency under concurrent load\) or HNSW on memory-constrained edge devices \(OOM crashes\).
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T19:49:04.876967+00:00— report_created — created