Report #2517
[architecture] Vector search infrastructure: pgvector vs managed vector databases
Start with pgvector inside your existing Postgres if vectors are a secondary feature and scale is below a few hundred thousand high-dimensional embeddings; move to a managed vector database only when recall, latency, hybrid search, or multi-tenant isolation becomes a first-class constraint.
Journey Context:
Teams often add Pinecone, Weaviate, or Qdrant by default when they hear 'vector search,' but pgvector has matured: it supports IVFFlat and HNSW indexes, distance operators, and ACID transactions over vectors stored alongside relational data. The win is a large operational simplification: no new network hop, no sync pipeline, one backup, one auth model. The loss is that Postgres is not optimized for massive vector-only workloads; index build times, memory usage, and concurrent brute-force (unindexed) queries can become bottlenecks. Managed vector DBs also tend to provide built-in reranking, sparse-dense hybrid search, and multi-tenancy controls that pgvector does not offer out of the box. One wrong move is choosing a separate vector store prematurely; the equally wrong opposite move is trying to squeeze millions of rapidly updated vectors into a single Postgres instance.
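To make the "vectors alongside relational data" point concrete, here is a minimal sketch of what the pgvector side could look like. It only builds SQL strings, so it runs without a database. The table name `documents`, the columns `tenant_id` and `body`, the 3-dimensional embedding, and the psycopg-style `%(name)s` placeholders are all assumptions made for illustration; they are not part of the original report.

```python
from typing import Sequence

# Hypothetical table name and dimension, chosen only for this illustration.
TABLE = "documents"
DIM = 3  # real embeddings usually have hundreds or thousands of dimensions


def to_vector_literal(embedding: Sequence[float]) -> str:
    """Format a sequence as pgvector's text input, e.g. '[0.1,0.2,0.3]'."""
    return "[" + ",".join(repr(float(x)) for x in embedding) + "]"


def schema_statements(table: str = TABLE, dim: int = DIM) -> list[str]:
    """DDL that keeps embeddings in the same table as the relational columns."""
    return [
        "CREATE EXTENSION IF NOT EXISTS vector",
        f"CREATE TABLE IF NOT EXISTS {table} ("
        "id bigserial PRIMARY KEY, "
        "tenant_id bigint NOT NULL, "
        "body text NOT NULL, "
        f"embedding vector({dim}) NOT NULL)",
        # HNSW index for approximate nearest-neighbor search (cosine distance).
        f"CREATE INDEX IF NOT EXISTS {table}_embedding_hnsw "
        f"ON {table} USING hnsw (embedding vector_cosine_ops)",
    ]


def nearest_neighbors_query(table: str = TABLE) -> str:
    """Parameterized k-NN query; <=> is pgvector's cosine distance operator."""
    return (
        "SELECT id, body, embedding <=> %(q)s::vector AS distance "
        f"FROM {table} "
        "WHERE tenant_id = %(tenant_id)s "  # relational filter in the same query
        "ORDER BY embedding <=> %(q)s::vector "
        "LIMIT %(k)s"
    )


if __name__ == "__main__":
    for stmt in schema_statements():
        print(stmt + ";")
    print(nearest_neighbors_query() + ";")
    print(to_vector_literal([0.1, 0.2, 0.3]))
```

With a driver such as psycopg, these statements would be executed inside an ordinary transaction, passing `to_vector_literal(...)` as the `q` parameter. The relational filter and the vector ordering live in one query against one database, which is the operational simplification described above; whether this remains adequate at larger scale is exactly the trade-off the report raises.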
Lifecycle
2026-06-15T12:51:21.458662+00:00— report_created — created