Vector Stores Guide
A comprehensive guide to choosing, configuring, and using vector stores in HazelJS RAG applications.
Overview
Vector stores are databases optimized for storing and searching high-dimensional vectors (embeddings). They enable semantic search by finding documents similar to a query based on vector similarity rather than keyword matching.
How Vector Stores Work
The Process
1. Indexing: Documents are converted to vectors using an embedding model
2. Storage: Vectors are stored in the vector database alongside their metadata
3. Querying: Search queries are converted to vectors with the same embedding model
4. Similarity Search: The database finds the vectors closest to the query vector
5. Results: Documents are returned ranked by similarity score
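To make these steps concrete, here is a minimal sketch of what happens under the hood, computing cosine similarity by hand. It assumes the embeddings.embed method used later in this guide; a real vector store replaces the linear scan with an optimized index such as HNSW.

import { OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({ apiKey: process.env.OPENAI_API_KEY });

// Cosine similarity between two equal-length vectors
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Steps 1-2: Indexing and storage
const docs = ['HazelJS ships a RAG package', 'Vector stores hold embeddings'];
const vectors = await Promise.all(docs.map(doc => embeddings.embed(doc)));

// Step 3: Querying (embed the query with the same model)
const queryVector = await embeddings.embed('Where are embeddings stored?');

// Steps 4-5: Similarity search and ranked results
const ranked = docs
  .map((content, i) => ({ content, score: cosineSimilarity(queryVector, vectors[i]) }))
  .sort((a, b) => b.score - a.score);

console.log(ranked);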
Choosing a Vector Store
Quick Recommendations
| Use Case | Recommended Store | Why |
|---|---|---|
| Local Development | Memory or ChromaDB | No setup, fast iteration |
| Prototyping | ChromaDB | Easy setup, persistent |
| Production (Serverless) | Pinecone | Fully managed, auto-scaling |
| Production (High-Performance) | Qdrant | Rust-based, extremely fast |
| Production (GraphQL) | Weaviate | GraphQL API, flexible |
| Cost-Sensitive Production | Qdrant or Weaviate | Open-source, self-hosted |
| Small Datasets (under 10K docs) | Any | All perform well |
| Large Datasets (over 1M docs) | Pinecone, Qdrant, Weaviate | Built for scale |
Memory Vector Store
Overview
In-memory vector storage with no external dependencies. Perfect for development and testing.
When to Use
✅ Good For:
- Local development
- Testing and CI/CD
- Prototyping
- Small datasets (under 10,000 documents)
- Learning RAG concepts
❌ Not Good For:
- Production applications
- Large datasets
- Multi-process applications
- Persistent storage needs
Setup
import { MemoryVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});
const vectorStore = new MemoryVectorStore(embeddings);
await vectorStore.initialize();
Configuration
No configuration needed! It works out of the box.
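A quick smoke test, using the common operations covered later in this guide:

// Index a couple of documents, then search semantically
await vectorStore.addDocuments([
  { content: 'HazelJS is a TypeScript framework', metadata: { category: 'docs' } },
  { content: 'Vector stores enable semantic search', metadata: { category: 'docs' } },
]);

const results = await vectorStore.search('what enables semantic search?', { topK: 1 });
console.log(results[0].content, results[0].score);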
Performance Characteristics
- Indexing Speed: ⭐⭐⭐⭐⭐ (Very Fast)
- Search Speed: ⭐⭐⭐⭐⭐ (Very Fast for under 10K docs)
- Scalability: ⭐ (Limited to memory)
- Persistence: ❌ (Data lost on restart)
Pinecone Vector Store
Overview
Fully managed, serverless vector database with automatic scaling and global distribution.
When to Use
✅ Good For:
- Production applications
- Serverless deployments
- Global applications
- Teams without DevOps
- Auto-scaling requirements
- Multi-tenancy (namespaces)
❌ Not Good For:
- Budget-constrained projects
- On-premise requirements
- Air-gapped environments
Setup
Step 1: Create Pinecone Account
Sign up at pinecone.io and create an API key.
Step 2: Install Client
npm install @pinecone-database/pinecone
Step 3: Create Index
In Pinecone dashboard:
- Index Name: my-knowledge-base
- Dimensions: 1536 (for OpenAI text-embedding-3-small)
- Metric: cosine
- Environment: Select your region
Step 4: Configure in Code
import { PineconeVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'text-embedding-3-small',
  dimensions: 1536,
});

const vectorStore = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws', // Your Pinecone environment
  indexName: 'my-knowledge-base',
  namespace: 'production', // Optional: for multi-tenancy
  textKey: 'content', // Optional: custom field name
  metadataKey: 'metadata', // Optional: custom metadata field
});
await vectorStore.initialize();
Advanced Features
Namespaces (Multi-Tenancy)
// Separate data by tenant
const tenant1Store = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws',
  indexName: 'shared-index',
  namespace: 'tenant-1',
});

const tenant2Store = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws',
  indexName: 'shared-index',
  namespace: 'tenant-2',
});
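Data written through one namespace is invisible to the other, so a tenant can never see another tenant's documents. A quick sketch (assuming both stores are initialized):

await tenant1Store.addDocuments([
  { content: 'Tenant 1 private data', metadata: { plan: 'enterprise' } },
]);

const hits1 = await tenant1Store.search('private data'); // Finds the document
const hits2 = await tenant2Store.search('private data'); // Finds nothing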
Metadata Filtering
// Add documents with rich metadata
await vectorStore.addDocuments([
  {
    content: 'Document content',
    metadata: {
      category: 'technical',
      date: '2024-01-01',
      author: 'John Doe',
    },
  },
]);

// Filter during search
const results = await vectorStore.search('query', {
  topK: 5,
  filter: {
    category: 'technical',
    author: 'John Doe',
  },
});
Performance Characteristics
- Indexing Speed: ⭐⭐⭐⭐ (Fast, network dependent)
- Search Speed: ⭐⭐⭐⭐⭐ (Very Fast, under 100ms)
- Scalability: ⭐⭐⭐⭐⭐ (Auto-scaling)
- Persistence: ✅ (Fully managed)
Pricing
- Free Tier: 1 index, 100K vectors
- Starter: $70/month for 5M vectors
- Enterprise: Custom pricing
Qdrant Vector Store
Overview
High-performance, Rust-based vector database optimized for speed and efficiency.
When to Use
✅ Good For:
- High-performance requirements
- Self-hosted deployments
- Cost-sensitive production
- Advanced filtering needs
- On-premise deployments
- Large-scale applications
❌ Not Good For:
- Teams without DevOps
- Serverless-only deployments
- Quick prototypes
Setup
Step 1: Install Client
npm install @qdrant/js-client-rest
Step 2: Start Qdrant Server
Using Docker:
docker run -p 6333:6333 qdrant/qdrant
Using Docker Compose:
version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage
Production Deployment:
# With persistence
docker run -p 6333:6333 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant
Step 3: Configure in Code
import { QdrantVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  apiKey: process.env.QDRANT_API_KEY, // Optional for local
  collectionName: 'my-collection',
  vectorSize: 1536, // Optional: auto-detected from embeddings
});
await vectorStore.initialize();
Advanced Features
Custom Distance Metrics
// Qdrant supports multiple distance metrics,
// configured during collection creation
const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  collectionName: 'my-collection',
  // Distance is set to Cosine by default
});
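To use a metric other than cosine, one option is to create the collection yourself with the raw Qdrant client before initializing the store, so the store attaches to the existing collection. A sketch using @qdrant/js-client-rest (verify this behavior against your adapter version):

import { QdrantClient } from '@qdrant/js-client-rest';

const client = new QdrantClient({ url: 'http://localhost:6333' });

// Create the collection up front with a non-default distance metric
await client.createCollection('my-collection', {
  vectors: {
    size: 1536, // Must match the embedding dimensions
    distance: 'Euclid', // 'Cosine' | 'Euclid' | 'Dot' | 'Manhattan'
  },
});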
Advanced Filtering
// Complex metadata filtering
const results = await vectorStore.search('query', {
  topK: 10,
  filter: {
    category: 'technical',
    date: { $gte: '2024-01-01' },
    tags: { $in: ['typescript', 'framework'] },
  },
});
Batch Operations
// Efficient batch indexing
const documents = Array.from({ length: 10000 }, (_, i) => ({
  content: `Document ${i}`,
  metadata: { index: i },
}));

await vectorStore.addDocuments(documents);
// Automatically batched for optimal performance
Performance Characteristics
- Indexing Speed: ⭐⭐⭐⭐⭐ (Very Fast, Rust-based)
- Search Speed: ⭐⭐⭐⭐⭐ (Extremely Fast)
- Scalability: ⭐⭐⭐⭐⭐ (Horizontal scaling)
- Persistence: ✅ (Configurable)
Production Deployment
Kubernetes
# Note: with replicas > 1, each replica needs its own storage;
# a StatefulSet with volumeClaimTemplates is usually preferred in practice.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: qdrant
spec:
  replicas: 3
  selector:
    matchLabels:
      app: qdrant
  template:
    metadata:
      labels:
        app: qdrant
    spec:
      containers:
        - name: qdrant
          image: qdrant/qdrant:latest
          ports:
            - containerPort: 6333
          volumeMounts:
            - name: storage
              mountPath: /qdrant/storage
      volumes:
        - name: storage
          persistentVolumeClaim:
            claimName: qdrant-pvc
Weaviate Vector Store
Overview
Open-source vector database with GraphQL API and advanced semantic search features.
When to Use
✅ Good For:
- GraphQL-first applications
- Complex semantic queries
- Hybrid search requirements
- Flexible schema needs
- Multi-modal search
- Knowledge graphs
❌ Not Good For:
- Simple use cases
- Teams unfamiliar with GraphQL
- Minimal setup requirements
Setup
Step 1: Install Client
npm install weaviate-ts-client
Step 2: Start Weaviate Server
Using Docker:
docker run -p 8080:8080 semitechnologies/weaviate:latest
Using Docker Compose:
version: '3.8'
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
    volumes:
      - ./weaviate_data:/var/lib/weaviate
Step 3: Configure in Code
import { WeaviateVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new WeaviateVectorStore(embeddings, {
  scheme: 'http',
  host: 'localhost:8080',
  apiKey: process.env.WEAVIATE_API_KEY, // Optional for local
  className: 'MyDocuments', // Weaviate class name
  textKey: 'content',
  metadataKeys: ['category', 'author', 'date'],
});
await vectorStore.initialize();
Advanced Features
GraphQL Queries
Weaviate uses GraphQL for querying, which the HazelJS adapter handles automatically.
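For the curious, a vector search corresponds roughly to a GraphQL Get query with a nearVector argument. A sketch using the weaviate-ts-client builder (illustrative only; the adapter issues the query for you, and the exact fields may differ):

import weaviate from 'weaviate-ts-client';

const client = weaviate.client({ scheme: 'http', host: 'localhost:8080' });

// Embed the query with the same model used for indexing
const queryVector = await embeddings.embed('search query');

// Roughly what a search compiles to: a GraphQL Get query with nearVector
const response = await client.graphql
  .get()
  .withClassName('MyDocuments')
  .withNearVector({ vector: queryVector })
  .withLimit(5)
  .withFields('content _additional { distance }')
  .do();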
Hybrid Search
// Weaviate has built-in hybrid search:
// it combines vector and keyword search natively
const results = await vectorStore.search('query', {
  topK: 10,
  // Weaviate automatically uses hybrid search
});
Multi-Modal Search
Weaviate supports images, text, and more; enabling multi-modal search requires additional Weaviate module configuration.
Performance Characteristics
- Indexing Speed: ⭐⭐⭐⭐ (Fast)
- Search Speed: ⭐⭐⭐⭐ (Fast)
- Scalability: ⭐⭐⭐⭐ (Good horizontal scaling)
- Persistence: ✅ (Configurable)
ChromaDB Vector Store
Overview
Lightweight, embedded vector database perfect for local development and prototyping.
When to Use
✅ Good For:
- Local development
- Prototyping
- Small to medium datasets
- Simple deployments
- Learning and experimentation
❌ Not Good For:
- Large-scale production
- High-concurrency applications
- Distributed systems
Setup
Step 1: Install Client
npm install chromadb
Step 2: Start ChromaDB Server
Using Docker:
docker run -p 8000:8000 chromadb/chroma
Using Python (Alternative):
pip install chromadb
chroma run --host 0.0.0.0 --port 8000
Step 3: Configure in Code
import { ChromaVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new ChromaVectorStore(embeddings, {
  url: 'http://localhost:8000',
  collectionName: 'my-collection',
  auth: { // Optional
    provider: 'token',
    credentials: process.env.CHROMA_TOKEN,
  },
});
await vectorStore.initialize();
Advanced Features
Collection Statistics
// ChromaDB-specific features
const stats = await vectorStore.getStats();
console.log(`Collection has ${stats.count} documents`);
Peek Documents
// Preview first N documents
const preview = await vectorStore.peek(10);
console.log('First 10 documents:', preview);
Performance Characteristics
- Indexing Speed: ⭐⭐⭐⭐ (Fast)
- Search Speed: ⭐⭐⭐⭐ (Fast for small datasets)
- Scalability: ⭐⭐⭐ (Limited for large datasets)
- Persistence: ✅ (File-based)
Common Operations
All vector stores implement the same interface:
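Sketched as a TypeScript interface, the shared surface looks roughly like this (simplified; consult each adapter for exact types):

interface VectorStoreDocument {
  content: string;
  metadata?: Record<string, unknown>;
}

interface SearchOptions {
  topK?: number;
  minScore?: number;
  filter?: Record<string, unknown>;
}

interface SearchResult extends VectorStoreDocument {
  score: number;
}

interface VectorStore {
  initialize(): Promise<void>;
  addDocuments(docs: VectorStoreDocument[]): Promise<string[]>;
  search(query: string, options?: SearchOptions): Promise<SearchResult[]>;
  getDocument(id: string): Promise<VectorStoreDocument | null>;
  updateDocument(id: string, doc: Partial<VectorStoreDocument>): Promise<void>;
  deleteDocuments(ids: string[]): Promise<void>;
  clear(): Promise<void>;
}

Individual adapters add extras on top, such as ChromaDB's getStats and peek shown earlier.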
Initialize
await vectorStore.initialize();
Add Documents
const ids = await vectorStore.addDocuments([
  {
    content: 'Document text',
    metadata: { category: 'tech', date: '2024-01-01' },
  },
  {
    content: 'Another document',
    metadata: { category: 'business' },
  },
]);
Search
const results = await vectorStore.search('search query', {
  topK: 5,
  minScore: 0.7,
  filter: { category: 'tech' },
});

results.forEach(result => {
  console.log(`Score: ${result.score}`);
  console.log(`Content: ${result.content}`);
  console.log(`Metadata:`, result.metadata);
});
Get Document
const document = await vectorStore.getDocument(documentId);

if (document) {
  console.log(document.content);
  console.log(document.metadata);
}
Update Document
await vectorStore.updateDocument(documentId, {
  content: 'Updated content',
  metadata: { updated: true },
});
Delete Documents
await vectorStore.deleteDocuments([id1, id2, id3]);
Clear All
await vectorStore.clear();
Performance Optimization
Batch Operations
// Bad: Individual operations
for (const doc of documents) {
  await vectorStore.addDocuments([doc]); // Slow!
}

// Good: Batch operation
await vectorStore.addDocuments(documents); // Fast!
Connection Pooling
// For self-hosted databases, use connection pooling
const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  // Connection pooling is handled internally
});
Caching
// Cache embeddings to avoid regeneration
const embeddingCache = new Map<string, number[]>();

async function getEmbedding(text: string) {
  if (embeddingCache.has(text)) {
    return embeddingCache.get(text);
  }
  const embedding = await embeddings.embed(text);
  embeddingCache.set(text, embedding);
  return embedding;
}
Monitoring and Debugging
Enable Logging
// Most vector stores support debug logging via the DEBUG convention.
// Set it before the process starts (e.g. DEBUG=qdrant:* node app.js),
// or assign it before the client library is loaded:
process.env.DEBUG = 'qdrant:*';
Track Performance
async function searchWithMetrics(query: string) {
  const start = Date.now();
  try {
    const results = await vectorStore.search(query);
    const duration = Date.now() - start;
    console.log(`Search completed in ${duration}ms`);
    console.log(`Found ${results.length} results`);
    return results;
  } catch (error) {
    console.error('Search failed:', error);
    throw error;
  }
}
Health Checks
async function checkVectorStoreHealth() {
  try {
    await vectorStore.initialize();
    console.log('✅ Vector store is healthy');
    return true;
  } catch (error) {
    console.error('❌ Vector store is unhealthy:', error);
    return false;
  }
}
Migration Between Vector Stores
Export from Memory Store
import fs from 'node:fs';

// Export all documents
const allDocs = await memoryStore.getAllDocuments();

// Save to file
fs.writeFileSync('backup.json', JSON.stringify(allDocs));
Import to Production Store
import fs from 'node:fs';

// Load from file
const docs = JSON.parse(fs.readFileSync('backup.json', 'utf-8'));

// Import to Pinecone
await pineconeStore.addDocuments(docs);
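For large backups, a single addDocuments call can hit request-size limits, so importing in fixed-size chunks is safer. A sketch (the chunk size of 100 is an arbitrary starting point):

// Import in chunks to stay under request-size limits
const CHUNK_SIZE = 100; // Arbitrary; tune for your store

for (let i = 0; i < docs.length; i += CHUNK_SIZE) {
  const chunk = docs.slice(i, i + CHUNK_SIZE);
  await pineconeStore.addDocuments(chunk);
  console.log(`Imported ${Math.min(i + CHUNK_SIZE, docs.length)} / ${docs.length} documents`);
}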
Gradual Migration
// Dual-write during migration
async function addDocument(doc: Document) {
  await Promise.all([
    oldStore.addDocuments([doc]),
    newStore.addDocuments([doc]),
  ]);
}

// Read from new, fall back to old
async function search(query: string) {
  try {
    return await newStore.search(query);
  } catch (error) {
    console.warn('New store failed, using old store');
    return await oldStore.search(query);
  }
}
Troubleshooting
Connection Issues
// Test connection
try {
  await vectorStore.initialize();
  console.log('✅ Connected');
} catch (error) {
  console.error('❌ Connection failed:', error);
  // Check: Is the server running?
  // Check: Are credentials correct?
  // Check: Is the network accessible?
}
Dimension Mismatch
// Error: Vector dimension mismatch
// Solution: Ensure embedding dimensions match index configuration
// OpenAI text-embedding-3-small = 1536 dimensions
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'text-embedding-3-small',
  dimensions: 1536, // Must match index
});
Slow Search Performance
- Check index size: Large indices need production stores
- Optimize topK: Request fewer results
- Use metadata filtering: Narrow search scope
- Enable caching: Cache frequent queries (see the sketch below)
- Upgrade hardware: More RAM/CPU for self-hosted
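One way to cache frequent queries, as referenced above: a small in-memory cache with a time-to-live. A sketch; the one-minute TTL is an arbitrary choice to tune against how often your index changes.

// Minimal query-result cache with a TTL
type CacheEntry = {
  results: Awaited<ReturnType<typeof vectorStore.search>>;
  expires: number;
};

const queryCache = new Map<string, CacheEntry>();
const TTL_MS = 60_000; // One minute (assumption)

async function cachedSearch(query: string) {
  const hit = queryCache.get(query);
  if (hit && hit.expires > Date.now()) {
    return hit.results; // Served from cache, no vector store round trip
  }
  const results = await vectorStore.search(query);
  queryCache.set(query, { results, expires: Date.now() + TTL_MS });
  return results;
}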
Best Practices
Start Simple, Scale Later
// Development
const devStore = new MemoryVectorStore(embeddings);

// Production
const prodStore = process.env.NODE_ENV === 'production'
  ? new PineconeVectorStore(embeddings, config)
  : new MemoryVectorStore(embeddings);
Use Environment Variables
const vectorStore = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: process.env.PINECONE_ENVIRONMENT,
  indexName: process.env.PINECONE_INDEX,
});
Implement Retry Logic
async function searchWithRetry(query: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await vectorStore.search(query);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      // Linear backoff: wait 1s, then 2s, then 3s...
      await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1)));
    }
  }
  throw new Error('unreachable'); // Only hit if maxRetries <= 0; also satisfies strict return checks
}
Monitor Costs
// Track API usage
let embeddingCalls = 0;
let searchCalls = 0;

const wrappedEmbeddings = {
  async embed(text: string) {
    embeddingCalls++;
    return embeddings.embed(text);
  },
};

// Wrap search the same way so searchCalls is actually incremented
async function trackedSearch(query: string) {
  searchCalls++;
  return vectorStore.search(query);
}

// Log periodically
setInterval(() => {
  console.log(`Embeddings: ${embeddingCalls}, Searches: ${searchCalls}`);
}, 60000);
What's Next?
- Learn about the RAG Package for a complete RAG implementation
- Explore the AI Package for LLM integration
- Check out Caching for performance optimization