pgvector Setup for an Open Brain (The Complete Walkthrough)
Postgres with the pgvector extension handles embedding search at the price point and latency an open brain needs. It takes under ten minutes to stand up, and this is the only schema you'll need.
Why pgvector (And Not a Dedicated Vector DB)
Unified Data Architecture
Implementing a pgvector open brain setup eliminates the architectural complexity of maintaining separate databases for relational data and embeddings. While dedicated vector stores like Pinecone, Weaviate, or Qdrant offer specialized scaling, they introduce synchronization overhead and additional billing cycles.
By keeping vectors within PostgreSQL, developers utilize standard SQL joins, ACID transactions, and Row-Level Security (RLS) on the same record containing the embedding. This colocation prevents data drift between a metadata store and a vector index.
Performance at Personal Scale
For an open brain system managing 10k to 10M rows, PostgreSQL performance is sufficient. Sub-100ms query latency is achievable on 100k-row datasets with proper indexing. Cost efficiency is also significantly better than a managed vector database: a Supabase free tier or a basic Postgres instance under $10/month typically suffices for a personal knowledge base.
Dedicated vector databases are optimized for billion-scale vectors, but for agentic memory and RAG, the flexibility of SQL outweighs the marginal latency gains of specialized engines.
Enable and Schema
Activating pgvector
The first step in a pgvector open brain setup is enabling the extension within the Supabase SQL Editor or any Postgres instance with the extension installed.
```sql
CREATE EXTENSION IF NOT EXISTS vector;
```
Memory Table Design
The schema must align with the dimensions of the chosen embedding model. Using 1536 dimensions ensures compatibility with OpenAI's text-embedding-3-small. Reducing this to 768 dimensions, as seen with Nomic Embed, halves the storage requirements per vector.
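The storage difference is easy to sanity-check. A back-of-envelope sketch (assuming pgvector's 4 bytes per float32 dimension and ignoring per-row and index overhead; `vector_storage_mb` is our own helper, not a pgvector API):

```python
# Rough storage math for the embedding column.
# Assumption: pgvector stores each dimension as a 4-byte float32.
BYTES_PER_DIM = 4

def vector_storage_mb(rows: int, dims: int) -> float:
    """Approximate raw vector storage in MB, excluding row and index overhead."""
    return rows * dims * BYTES_PER_DIM / 1_000_000

print(vector_storage_mb(100_000, 1536))  # ~614 MB at OpenAI's 1536 dims
print(vector_storage_mb(100_000, 768))   # ~307 MB at Nomic's 768 dims
```

At 100k rows, dropping from 1536 to 768 dimensions halves raw vector storage, roughly 614 MB down to 307 MB, before any index overhead.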
```sql
CREATE TABLE memories (
  id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  content    TEXT NOT NULL,
  embedding  VECTOR(1536),
  metadata   JSONB,
  source     TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
```
```sql
CREATE INDEX idx_memories_created_at ON memories (created_at DESC);
```
The B-tree index on created_at is critical for recency filtering, allowing the system to prioritize recent memories over stale data during retrieval.
The Index Decision: IVFFlat vs HNSW
Comparing Indexing Strategies
Choosing between IVFFlat and HNSW depends on the trade-off between build speed, memory consumption, and query recall. For a pgvector open brain setup at personal scale (under 1M rows), both are viable, but they behave differently.
| Feature | IVFFlat | HNSW |
|---|---|---|
| Build Speed | Fast | Slow |
| Memory Use | Low | High (2-4x) |
| Query Latency | Moderate | Very Low |
| Recall Accuracy | Approximate | High |
IVFFlat Implementation
IVFFlat partitions vectors into lists. The pgvector guidance is to set lists to roughly rows / 1000 for tables up to 1M rows, and to the square root of the row count beyond that. Build the index after loading data, since the list centroids are computed from the rows present at build time. For prototypes, a list count of 100 (right for ~100k rows) typically builds in under 30 seconds.
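The heuristic above can be sketched as a small helper (the function name and the floor of 10 are our choices; the thresholds follow pgvector's documented guidance):

```python
import math

# lists ≈ rows/1000 up to ~1M rows, sqrt(rows) beyond that (pgvector guidance).
def suggest_lists(rows: int) -> int:
    if rows <= 1_000_000:
        return max(10, rows // 1000)  # floor of 10 keeps tiny tables sane
    return round(math.sqrt(rows))

print(suggest_lists(100_000))    # 100 -> matches the `lists = 100` example below
print(suggest_lists(4_000_000))  # 2000
```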
```sql
CREATE INDEX ON memories USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);
```
HNSW Implementation
Hierarchical Navigable Small World (HNSW) graphs provide superior recall and faster query speeds for high-QPS workloads, though they require more RAM.
```sql
CREATE INDEX ON memories USING hnsw (embedding vector_cosine_ops);
```
Query Patterns
Semantic Similarity Search
The <=> operator calculates cosine distance. Because it measures distance, a smaller value indicates higher similarity. To convert this to a similarity score (0 to 1), subtract the result from 1.
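What `<=>` computes is easy to reproduce in plain Python, which makes the `1 - distance` conversion concrete (toy 3-dimension vectors stand in for real 1536-dimension embeddings):

```python
import math

# Cosine distance, as the <=> operator computes it: 1 - cosine similarity.
def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return 1 - dot / norm

query, memory = [1.0, 0.0, 0.0], [1.0, 1.0, 0.0]
dist = cosine_distance(query, memory)   # smaller distance = more similar
similarity = 1 - dist                   # the `1 - (embedding <=> ...)` trick
print(round(similarity, 4))             # 0.7071, i.e. cos(45°)
```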
```sql
SELECT content, 1 - (embedding <=> '[your_query_vector]') AS similarity
FROM memories
ORDER BY embedding <=> '[your_query_vector]'
LIMIT 5;
```
Hybrid and Filtered Retrieval
Combining semantic search with SQL filters allows for precise memory recall. This is essential for a pgvector open brain setup to avoid context window pollution.
Metadata Filtering:
Metadata Filtering:
```sql
SELECT content
FROM memories
WHERE source = 'obsidian_vault'
ORDER BY embedding <=> '[your_query_vector]'
LIMIT 5;
```
Hybrid Keyword Search: Using websearch_to_tsquery alongside vectors allows the system to find exact terms that embeddings might miss.
```sql
SELECT content
FROM memories
WHERE websearch_to_tsquery('english', 'Project X') @@ to_tsvector('english', content)
ORDER BY embedding <=> '[your_query_vector]'
LIMIT 5;
```
Operational Notes
Production Stability
Maintaining a pgvector open brain setup requires attention to database health and latency. For those using Supabase, Point-in-Time Recovery (PITR) is recommended so that a botched schema migration can be rolled back rather than causing permanent data loss.
To handle concurrent AI agent requests, use connection pooling via PgBouncer. This prevents the overhead of creating new TCP connections for every embedding lookup, which can otherwise spike latency.
Performance Monitoring
The target p99 latency budget for an agent-query loop is under 150ms at 100k rows. If queries exceed this threshold, work through the following:
- Migrate from IVFFlat to HNSW indexes.
- Upgrade the Supabase instance tier to increase available RAM for index caching.
- Implement Row-Level Security (RLS) if the memory system serves multiple users to ensure data isolation without sacrificing speed.
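One way to check the p99 budget from logged query timings, using only the standard library (the sample numbers here are made up; feed in your own measurements):

```python
from statistics import quantiles

# Logged query latencies in milliseconds (illustrative sample data).
latencies_ms = [12, 18, 25, 9, 140, 33, 41, 15, 22, 310,
                28, 19, 36, 11, 47, 23, 16, 29, 38, 21]

def p99(samples):
    # n=100 yields 99 cut points; index 98 is the 99th percentile
    return quantiles(samples, n=100, method="inclusive")[98]

BUDGET_MS = 150
print(p99(latencies_ms), p99(latencies_ms) <= BUDGET_MS)
# This sample's p99 is 277.7 ms — the two outliers blow the 150 ms budget.
```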