Report #85868
[architecture] Vector search with HNSW index causes OOM or extreme slowdown despite raw vector data fitting in RAM
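Budget RAM for HNSW as roughly vector_count * dimensions * 4 bytes * (2 to 4x) overhead factor for graph links (controlled by the m parameter). If memory is constrained, use IVF with nlist (and the number of lists probed per query) tuned to your latency and recall targets, or reduce HNSW m to lower graph density at the cost of recall. Lowering ef_construction shortens index builds but does not shrink the finished graph, because the number of links stored per node is capped by m.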
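The budgeting rule above can be sketched as a small calculation. This is a minimal illustration, not a measurement: the function name `hnsw_ram_budget_bytes` is hypothetical, float32 storage (4 bytes per value) is assumed, and the 2x and 4x bounds are simply the rule of thumb from this report. Actual overhead depends on the vector store and its HNSW implementation.

```python
def hnsw_ram_budget_bytes(vector_count: int, dimensions: int,
                          bytes_per_value: int = 4) -> tuple[int, int]:
    """Return a (low, high) RAM budget in bytes for an HNSW index.

    Applies the report's rule of thumb: raw vector size times a
    2x to 4x overhead factor. The factors are planning assumptions,
    not measured values.
    """
    raw_bytes = vector_count * dimensions * bytes_per_value
    return 2 * raw_bytes, 4 * raw_bytes


# Example: 1M vectors of 1536 float32 dimensions.
low, high = hnsw_ram_budget_bytes(1_000_000, 1536)
print(f"Budget {low / 1e9:.1f} GB to {high / 1e9:.1f} GB of RAM")
```

Treat the result as a starting point for capacity planning and confirm it by measuring the resident memory of a real index built on a sample of the data.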
Journey Context:
HNSW builds a multi-layer navigable small-world graph in which each vector links to up to m neighbors per layer (default 16). This graph structure consumes memory beyond the raw vectors, often 2-4x the raw size for high m values. Developers often calculate 1536 dimensions * 4 bytes (float32) ≈ 6 KB per vector and plan for 1M vectors ≈ 6 GB, but at a 2-4x overhead factor an HNSW index can require roughly 12-24 GB of RAM. When the index exceeds RAM, query latency can spike by around 100x because random graph traversals turn into disk I/O. Unlike B-trees, HNSW is not designed for efficient disk access. The alternative, IVF (inverted file index), has lower memory overhead (centroids + inverted lists) but higher query latency at high recall. The tradeoff is fundamental: HNSW is memory-bound with high recall, while IVF is memory-efficient but slower.
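The tradeoff described above can be turned into a rough pre-deployment check. The sketch below is illustrative only: `choose_index` is a hypothetical helper, float32 vectors are assumed, and the default overhead factor of 4.0 is the conservative end of this report's 2-4x range rather than a measured value.

```python
def choose_index(vector_count: int, dimensions: int,
                 available_ram_bytes: int,
                 overhead_factor: float = 4.0) -> str:
    """Suggest "hnsw" if the estimated index fits in RAM, else "ivf".

    The estimate is raw float32 size times an assumed overhead factor.
    """
    raw_bytes = vector_count * dimensions * 4  # float32 = 4 bytes per value
    estimated_hnsw_bytes = raw_bytes * overhead_factor
    if estimated_hnsw_bytes <= available_ram_bytes:
        return "hnsw"
    return "ivf"


GIB = 2 ** 30
# 1M vectors x 1536 dims is ~6.1 GB raw, ~24.6 GB at a 4x factor.
print(choose_index(1_000_000, 1536, 16 * GIB))  # does not fit in 16 GiB
print(choose_index(1_000_000, 1536, 32 * GIB))  # fits in 32 GiB
```

A real decision should also leave headroom for the rest of the process and the operating system, and should weigh the recall and latency targets rather than memory alone.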
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T02:43:07.589633+00:00— report_created — created