pgvector Setup for an Open Brain (The Complete Walkthrough)
Postgres with the pgvector extension handles embedding search at the price point and latency an open brain needs. It takes under ten minutes to stand up, and this is the only schema you'll need.
Why pgvector (And Not a Dedicated Vector DB)
Unified Data Architecture
Implementing a pgvector open brain setup eliminates the architectural complexity of maintaining separate databases for relational data and embeddings. While dedicated vector stores like Pinecone, Weaviate, or Qdrant offer specialized scaling, they introduce synchronization overhead and additional billing cycles.
By keeping vectors within PostgreSQL, developers utilize standard SQL joins, ACID transactions, and Row-Level Security (RLS) on the same record containing the embedding. This colocation prevents data drift between a metadata store and a vector index.
Performance at Personal Scale
For an open brain system managing 10k to 10M rows, PostgreSQL performance is sufficient. Sub-100ms query latency is achievable on 100k-row datasets with proper indexing. Cost efficiency is also significantly better than a managed vector database: a Supabase free tier or a basic Postgres instance under $10/month typically suffices for a personal knowledge base.
Dedicated vector databases are optimized for billion-scale vectors, but for agentic memory and RAG, the flexibility of SQL outweighs the marginal latency gains of specialized engines.
Enable and Schema
Activating pgvector
The first step in a pgvector open brain setup is enabling the extension within the Supabase SQL Editor or any Postgres instance with the extension installed.
```sql
CREATE EXTENSION IF NOT EXISTS vector;
```
Memory Table Design
The schema must align with the dimensions of the chosen embedding model. Using 1536 dimensions ensures compatibility with OpenAI's text-embedding-3-small. Reducing this to 768 dimensions, as seen with Nomic Embed, halves the storage requirements per vector.
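The storage difference is easy to sanity-check. A back-of-envelope sketch (assuming pgvector's 4 bytes per float32 dimension and ignoring per-row and index overhead; `vector_storage_mb` is our own helper, not a pgvector API):

```python
# Rough storage math for the embedding column.
# Assumption: pgvector stores each dimension as a 4-byte float32.
BYTES_PER_DIM = 4

def vector_storage_mb(rows: int, dims: int) -> float:
    """Approximate raw vector storage in MB, excluding row and index overhead."""
    return rows * dims * BYTES_PER_DIM / 1_000_000

print(vector_storage_mb(100_000, 1536))  # ~614 MB at OpenAI's 1536 dims
print(vector_storage_mb(100_000, 768))   # ~307 MB at Nomic's 768 dims
```

At 100k rows, dropping from 1536 to 768 dimensions halves raw vector storage, roughly 614 MB down to 307 MB, before any index overhead.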
```sql
CREATE TABLE memories (
  id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content    TEXT NOT NULL,
  embedding  VECTOR(1536),
  metadata   JSONB,
  source     TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
```
```sql
CREATE INDEX idx_memories_created_at ON memories (created_at DESC);
```
The B-tree index on created_at is critical for recency filtering, allowing the system to prioritize recent memories over stale data during retrieval.
The Index Decision: IVFFlat vs HNSW
Comparing Indexing Strategies
Choosing between IVFFlat and HNSW depends on the trade-off between build speed, memory consumption, and query recall. For a pgvector open brain setup at personal scale (under 1M rows), both are viable, but they behave differently.
| Feature | IVFFlat | HNSW |
|---|---|---|
| Build Speed | Fast | Slow |
| Memory Use | Low | High (2-4x) |
| Query Latency | Moderate | Very Low |
| Recall Accuracy | Approximate | High |
IVFFlat Implementation
IVFFlat partitions vectors into lists. The pgvector guidance is to set lists to roughly rows / 1000 for tables up to 1M rows, and to the square root of the row count beyond that. Build the index after loading data, since the list centroids are computed from the rows present at build time. For prototypes, a list count of 100 (right for ~100k rows) typically builds in under 30 seconds.
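The heuristic above can be sketched as a small helper (the function name and the floor of 10 are our choices; the thresholds follow pgvector's documented guidance):

```python
import math

# lists ≈ rows/1000 up to ~1M rows, sqrt(rows) beyond that (pgvector guidance).
def suggest_lists(rows: int) -> int:
    if rows <= 1_000_000:
        return max(10, rows // 1000)  # floor of 10 keeps tiny tables sane
    return round(math.sqrt(rows))

print(suggest_lists(100_000))    # 100 -> matches the `lists = 100` example below
print(suggest_lists(4_000_000))  # 2000
```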
```sql
CREATE INDEX ON memories USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```
HNSW Implementation
Hierarchical Navigable Small World (HNSW) graphs provide superior recall and faster query speeds for high-QPS workloads, though they require more RAM.
```sql
CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops);
```
Query Patterns
Semantic Similarity Search
The <=> operator calculates cosine distance. Because it measures distance, a smaller value indicates higher similarity. To convert this to a similarity score (0 to 1), subtract the result from 1.
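What `<=>` computes is easy to reproduce in plain Python, which makes the `1 - distance` conversion concrete (toy 3-dimension vectors stand in for real 1536-dimension embeddings):

```python
import math

# Cosine distance, as the <=> operator computes it: 1 - cosine similarity.
def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1 - dot / norm

query, memory = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
dist = cosine_distance(query, memory)   # smaller distance = more similar
similarity = 1 - dist                   # the `1 - (embedding <=> ...)` trick
print(round(similarity, 4))             # 0.7071, i.e. cos(45°)
```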
```sql
SELECT content, 1 - (embedding <=> '[your_query_vector]') AS similarity
FROM memories
ORDER BY embedding <=> '[your_query_vector]'
LIMIT 5;
```
Hybrid and Filtered Retrieval
Combining semantic search with SQL filters allows for precise memory recall. This is essential for a pgvector open brain setup to avoid context window pollution.
Metadata Filtering:
Metadata Filtering:
```sql
SELECT content
FROM memories
WHERE source = 'obsidian_vault'
ORDER BY embedding <=> '[your_query_vector]'
LIMIT 5;
```
Hybrid Keyword Search: Using websearch_to_tsquery alongside vectors allows the system to find exact terms that embeddings might miss.
```sql
SELECT content
FROM memories
WHERE websearch_to_tsquery('english', 'Project X') @@ to_tsvector('english', content)
ORDER BY embedding <=> '[your_query_vector]'
LIMIT 5;
```
Operational Notes
Production Stability
Maintaining a pgvector open brain setup requires attention to database health and latency. For those using Supabase, Point-in-Time Recovery (PITR) is recommended so that a botched schema migration can be rolled back rather than causing permanent data loss.
To handle concurrent AI agent requests, use connection pooling via PgBouncer. This prevents the overhead of creating new TCP connections for every embedding lookup, which can otherwise spike latency.
Performance Monitoring
The target p99 latency budget for an agent-query loop is under 150ms at 100k rows. If queries exceed this threshold, work through the following:
- Migrate from IVFFlat to HNSW indexes.
- Upgrade the Supabase instance tier to increase available RAM for index caching.
- Implement Row-Level Security (RLS) if the memory system serves multiple users to ensure data isolation without sacrificing speed.
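One way to check the p99 budget from logged query timings, using only the standard library (the sample numbers here are made up; feed in your own measurements):

```python
from statistics import quantiles

# Logged query latencies in milliseconds (illustrative sample data).
latencies_ms = [12, 18, 25, 9, 140, 33, 41, 15, 22, 310,
                28, 19, 36, 11, 47, 23, 16, 29, 38, 21]

def p99(samples):
    # n=100 yields 99 cut points; index 98 is the 99th percentile
    return quantiles(samples, n=100, method="inclusive")[98]

BUDGET_MS = 150
print(p99(latencies_ms), p99(latencies_ms) <= BUDGET_MS)
# This sample's p99 is 277.7 ms — the two outliers blow the 150 ms budget.
```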