Report #7472
[architecture] Vector similarity search returns incomplete results when metadata filters are highly selective
Avoid simple post-filtering \(retrieve top-K, then filter\) for selective metadata predicates; recall drops severely when the filter matches <10% of the data. Avoid naive pre-filtering on ANN indexes that lack filtered-search support: on graph-based indexes \(HNSW\), traversal stalls in low-density regions, and on inverted-file indexes \(IVFFlat, which is cluster-based rather than graph-based\), the probed clusters may contain few matching rows. Select vector stores with native 'filtered ANN' \(Weaviate, Milvus, Vespa\), or use pgvector when a selective btree index lets the planner narrow the rows before computing exact distances. For high-cardinality selective filters, use a two-phase approach: first query the metadata index to get candidate IDs \(or a bitmap\), then run vector search restricted to that set \(ID-based pre-filtering\), accepting the latency tradeoff. Never assume vector and metadata indexes compose efficiently without specific engine support.
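The two-phase approach above can be sketched in plain Python. This is a minimal illustration, not any engine's actual implementation: `build_metadata_index`, `two_phase_search`, the in-memory dictionaries, and the synthetic data are all hypothetical, and phase 2 uses an exact brute-force scan over the candidate IDs as a stand-in for an ID-restricted vector search.

```python
import heapq
import math
import random


def cosine_distance(a, b):
    """Cosine distance between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def build_metadata_index(metadata):
    """Invert {doc_id: {field: value}} into {(field, value): set of doc_ids}."""
    index = {}
    for doc_id, fields in metadata.items():
        for field, value in fields.items():
            index.setdefault((field, value), set()).add(doc_id)
    return index


def two_phase_search(query, vectors, meta_index, predicates, k):
    """Phase 1: resolve the filter to candidate IDs. Phase 2: search only those."""
    postings = [meta_index.get(p, set()) for p in predicates]
    candidates = set.intersection(*postings) if postings else set(vectors)
    # Exact scan over the (small) candidate set; cost grows with its size,
    # which is the latency tradeoff of ID-based pre-filtering.
    return heapq.nsmallest(
        k, candidates, key=lambda i: cosine_distance(query, vectors[i])
    )


# Synthetic data (assumption): 2000 docs, 50 tenants, half of each tenant active,
# so 'tenant_id = 5 AND status = active' matches 20 docs (1% of the data).
random.seed(0)
vectors = {i: [random.gauss(0, 1) for _ in range(8)] for i in range(2000)}
metadata = {
    i: {"tenant_id": i % 50, "status": "active" if (i // 50) % 2 == 0 else "archived"}
    for i in range(2000)
}
meta_index = build_metadata_index(metadata)
query = [random.gauss(0, 1) for _ in range(8)]

hits = two_phase_search(
    query, vectors, meta_index, [("tenant_id", 5), ("status", "active")], k=10
)
print(len(hits), "results, all matching the filter")
```

Because the filter is resolved first, every returned ID satisfies the predicate and the result count falls short of `k` only when fewer than `k` rows match at all. The cost is a scan proportional to the candidate set, so this suits selective filters and degrades as the filter matches more of the data.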
Journey Context:
Engineers adopt vector databases \(Pinecone, Weaviate, pgvector\) for semantic search, then add metadata filters \(e.g., 'tenant\_id = 5 AND status = active'\), assuming the database will optimize the conjunction the way a relational query planner would. In reality, vector indexes \(HNSW, IVF\) are built to search the whole, unconstrained vector space. 'Post-filtering' \(retrieving 1000 neighbors, then keeping only those that match the filter\) fails when the filter is highly selective: if it matches 1% of the data and is uncorrelated with the query, only about 0.01 \* 1000 = 10 results survive when the user asked for 100. 'Pre-filtering' \(restricting the vector space before search\) on a graph index without filter support leaves the greedy search stuck in local optima, because the subgraph of matching nodes is disconnected, and recall collapses. Databases handle this differently: Weaviate uses HNSW with filter-aware graph traversal; Pinecone uses metadata indexes with specific filtering limitations \(high-cardinality filters can be slow\); pgvector leaves the choice to the PostgreSQL planner, which either scans the vector index and filters afterwards or uses a btree index on the metadata and computes exact distances \(efficient only for small result sets\). The hard-won insight is that vector indexes and metadata indexes are separate structures that do not compose automatically; efficient composition requires specific engine support for 'filtered ANN' or accepting the latency of a two-phase \(metadata-first, then vector\) approach.
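The post-filtering shortfall described above can be reproduced with a small simulation. This is a sketch under stated assumptions: `post_filter_search` is a hypothetical helper, an exact sort stands in for the ANN index, and the filter \(every 100th ID, i.e. 1% of rows\) is deliberately uncorrelated with vector position.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def post_filter_search(query, vectors, matches_filter, fetch_k, want_k):
    """Post-filtering: take the fetch_k nearest rows, then drop non-matching ones."""
    nearest = sorted(vectors, key=lambda i: sq_dist(query, vectors[i]))[:fetch_k]
    return [i for i in nearest if matches_filter(i)][:want_k]


random.seed(0)
TOTAL, FETCH_K, WANT_K = 10_000, 1_000, 100
vectors = {i: [random.random() for _ in range(4)] for i in range(TOTAL)}
query = [random.random() for _ in range(4)]


def matches_filter(doc_id):
    # Assumption: the filter matches 1% of rows, independent of the vectors.
    return doc_id % 100 == 0


matching = sum(1 for i in vectors if matches_filter(i))
expected = FETCH_K * matching // TOTAL  # 1000 * 100 // 10000 = 10

shortfall = post_filter_search(query, vectors, matches_filter, FETCH_K, WANT_K)
full_scan = post_filter_search(query, vectors, matches_filter, TOTAL, WANT_K)

print(f"expected about {expected} survivors, got {len(shortfall)} of {WANT_K} requested")
print(f"fetching every row returns {len(full_scan)} results")
```

The only way post-filtering reaches the requested 100 here is to fetch on the order of K divided by the filter's match fraction \(100 / 0.01 = 10,000 rows, the whole dataset in this example\), which removes the benefit of the ANN index.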
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-16T02:47:01.569283+00:00— report_created — created