Report #1241
[architecture] Default HNSW parameters give poor recall or unexpected latency spikes in production vector search.
Size M and efConstruction to your recall target and dataset scale rather than relying on library defaults. At query time, start with efSearch >= max(2*k, efConstruction/2) and validate recall@k on a labeled holdout set.
Journey Context:
HNSW is the default ANN index in most vector stores, but common defaults (e.g., M=16, efConstruction=64) are tuned for moderate-sized datasets and can fall short on data with high intrinsic dimensionality or on billion-scale corpora. The three parameters trade off differently: higher M improves recall at the cost of index size and build time; higher efConstruction improves graph quality at the cost of build time; higher efSearch improves recall at the cost of query latency. A common mistake is tuning only efSearch while leaving M and efConstruction low; a larger efSearch can only partly compensate for a sparse or poorly built graph. HNSW also degrades on highly clustered or extremely high-dimensional data, where IVF or IVF-PQ can be a better fit. Always benchmark recall@k on your own data, because HNSW offers only weak worst-case guarantees.
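A minimal sketch of the validation step described above, using only the Python standard library. It is not tied to any particular vector store: `search(query, k, ef)` is a hypothetical adapter you would write around your own index's query call, and the helper names (`ef_search_floor`, `exact_top_k`, `recall_at_k`, `sweep_ef_search`) are illustrative. Ground truth here comes from brute-force L2 search, which is only practical on a sampled holdout set.

```python
import math
import random


def ef_search_floor(k, ef_construction):
    """Starting efSearch from the report's heuristic: max(2*k, efConstruction/2)."""
    return max(2 * k, math.ceil(ef_construction / 2))


def exact_top_k(query, vectors, k):
    """Brute-force ground truth: ids of the k nearest vectors by squared L2."""
    dists = [
        (sum((q - v) ** 2 for q, v in zip(query, vec)), idx)
        for idx, vec in enumerate(vectors)
    ]
    dists.sort()
    return [idx for _, idx in dists[:k]]


def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the true top-k that the approximate result recovered."""
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k


def sweep_ef_search(search, queries, ground_truth, k, ef_values):
    """Mean recall@k for each efSearch value.

    `search(query, k, ef)` is a hypothetical adapter around your index's
    query call; it must return a list of ids.
    """
    results = {}
    for ef in ef_values:
        recalls = [
            recall_at_k(search(q, k, ef), truth, k)
            for q, truth in zip(queries, ground_truth)
        ]
        results[ef] = sum(recalls) / len(recalls)
    return results


if __name__ == "__main__":
    rng = random.Random(0)
    vectors = [[rng.random() for _ in range(8)] for _ in range(200)]
    queries = [[rng.random() for _ in range(8)] for _ in range(5)]
    k = 10
    truth = [exact_top_k(q, vectors, k) for q in queries]

    # Stand-in for a real index: exact search, so recall is 1.0 by construction.
    # Replace this with a call into your HNSW index to get meaningful numbers.
    def exact_search(query, k, ef):
        return exact_top_k(query, vectors, k)

    start = ef_search_floor(k, ef_construction=64)
    print(sweep_ef_search(exact_search, queries, truth, k, [start, 2 * start]))
```

In practice you would run the sweep once per (M, efConstruction) build and record query latency next to each recall figure, so the trade-off is visible before choosing production settings.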
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-13T19:54:26.396381+00:00— report_created — created