Vector Stores Guide

A comprehensive guide to choosing, configuring, and using vector stores in HazelJS RAG applications.

Overview

Vector stores are databases optimized for storing and searching high-dimensional vectors (embeddings). They enable semantic search by finding documents similar to a query based on vector similarity rather than keyword matching.
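
"Similarity" here is usually a geometric measure such as cosine similarity between embedding vectors. As a plain TypeScript illustration (not part of the HazelJS API):

// Cosine similarity between two equal-length vectors:
// 1 = same direction (very similar), 0 = orthogonal (unrelated)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}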

How Vector Stores Work

The Process

  1. Indexing: Documents are converted to vectors using an embedding model
  2. Storage: The vectors are stored in the vector database together with their metadata
  3. Querying: Search queries are converted to vectors using the same embedding model
  4. Similarity Search: The database finds the stored vectors closest to the query vector
  5. Results: The matching documents are returned, ranked by similarity score (see the sketch below)
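
End to end, the process maps onto the store API like this (a minimal sketch using the memory store and the shared operations covered under Common Operations below):

import { MemoryVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({ apiKey: process.env.OPENAI_API_KEY });
const vectorStore = new MemoryVectorStore(embeddings);
await vectorStore.initialize();

// Steps 1-2: documents are embedded and stored with their metadata
await vectorStore.addDocuments([
  { content: 'HazelJS supports multiple vector stores', metadata: { topic: 'rag' } },
]);

// Steps 3-4: the query is embedded and matched against the stored vectors
const results = await vectorStore.search('Which vector stores does HazelJS support?', { topK: 3 });

// Step 5: results arrive ranked by similarity score
results.forEach(r => console.log(r.score, r.content));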

Choosing a Vector Store

Quick Recommendations

| Use Case | Recommended Store | Why |
| --- | --- | --- |
| Local Development | Memory or ChromaDB | No setup, fast iteration |
| Prototyping | ChromaDB | Easy setup, persistent |
| Production (Serverless) | Pinecone | Fully managed, auto-scaling |
| Production (High-Performance) | Qdrant | Rust-based, extremely fast |
| Production (GraphQL) | Weaviate | GraphQL API, flexible |
| Cost-Sensitive Production | Qdrant or Weaviate | Open-source, self-hosted |
| Small Datasets (under 10K docs) | Any | All perform well |
| Large Datasets (over 1M docs) | Pinecone, Qdrant, Weaviate | Built for scale |

Memory Vector Store

Overview

In-memory vector storage with no external dependencies. Perfect for development and testing.

When to Use

Good For:

  • Local development
  • Testing and CI/CD
  • Prototyping
  • Small datasets (under 10,000 documents)
  • Learning RAG concepts

Not Good For:

  • Production applications
  • Large datasets
  • Multi-process applications
  • Persistent storage needs

Setup

import { MemoryVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new MemoryVectorStore(embeddings);
await vectorStore.initialize();

Configuration

No configuration needed! It works out of the box.
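
A quick smoke test, using the shared operations described under Common Operations below:

await vectorStore.addDocuments([
  { content: 'Hello, vector world', metadata: { source: 'test' } },
]);

const results = await vectorStore.search('a greeting', { topK: 1 });
console.log(results[0]?.content); // 'Hello, vector world'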

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐⭐ (Very Fast)
  • Search Speed: ⭐⭐⭐⭐⭐ (Very Fast for under 10K docs)
  • Scalability: ⭐ (Limited to memory)
  • Persistence: ❌ (Data lost on restart)

Pinecone Vector Store

Overview

Fully managed, serverless vector database with automatic scaling and global distribution.

When to Use

Good For:

  • Production applications
  • Serverless deployments
  • Global applications
  • Teams without DevOps
  • Auto-scaling requirements
  • Multi-tenancy (namespaces)

Not Good For:

  • Budget-constrained projects
  • On-premise requirements
  • Air-gapped environments

Setup

Step 1: Create Pinecone Account

Sign up at pinecone.io. (Index creation is covered in Step 3.)

Step 2: Install Client

npm install @pinecone-database/pinecone

Step 3: Create Index

In Pinecone dashboard:

  • Index Name: my-knowledge-base
  • Dimensions: 1536 (for OpenAI text-embedding-3-small)
  • Metric: cosine
  • Environment: Select your region
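
Alternatively, the index can be created programmatically. A sketch assuming a recent version of the @pinecone-database/pinecone SDK (check the Pinecone docs for the exact spec options available on your plan):

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// Dimensions and metric must match your embedding model (see Step 4)
await pc.createIndex({
  name: 'my-knowledge-base',
  dimension: 1536,
  metric: 'cosine',
  spec: { serverless: { cloud: 'aws', region: 'us-east-1' } },
});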

Step 4: Configure in Code

import { PineconeVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'text-embedding-3-small',
  dimensions: 1536,
});

const vectorStore = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws', // Your Pinecone environment
  indexName: 'my-knowledge-base',
  namespace: 'production', // Optional: for multi-tenancy
  textKey: 'content', // Optional: custom field name
  metadataKey: 'metadata', // Optional: custom metadata field
});

await vectorStore.initialize();

Advanced Features

Namespaces (Multi-Tenancy)

// Separate data by tenant
const tenant1Store = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws',
  indexName: 'shared-index',
  namespace: 'tenant-1',
});

const tenant2Store = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws',
  indexName: 'shared-index',
  namespace: 'tenant-2',
});

Metadata Filtering

// Add documents with rich metadata
await vectorStore.addDocuments([
  {
    content: 'Document content',
    metadata: {
      category: 'technical',
      date: '2024-01-01',
      author: 'John Doe',
    },
  },
]);

// Filter during search
const results = await vectorStore.search('query', {
  topK: 5,
  filter: {
    category: 'technical',
    author: 'John Doe',
  },
});

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐ (Fast, network dependent)
  • Search Speed: ⭐⭐⭐⭐⭐ (Very Fast, under 100ms)
  • Scalability: ⭐⭐⭐⭐⭐ (Auto-scaling)
  • Persistence: ✅ (Fully managed)

Pricing

  • Free Tier: 1 index, 100K vectors
  • Starter: $70/month for 5M vectors
  • Enterprise: Custom pricing

Qdrant Vector Store

Overview

High-performance, Rust-based vector database optimized for speed and efficiency.

When to Use

Good For:

  • High-performance requirements
  • Self-hosted deployments
  • Cost-sensitive production
  • Advanced filtering needs
  • On-premise deployments
  • Large-scale applications

Not Good For:

  • Teams without DevOps
  • Serverless-only deployments
  • Quick prototypes

Setup

Step 1: Install Client

npm install @qdrant/js-client-rest

Step 2: Start Qdrant Server

Using Docker:

docker run -p 6333:6333 qdrant/qdrant

Using Docker Compose:

version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage

Production Deployment:

# With persistence
docker run -p 6333:6333 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant

Step 3: Configure in Code

import { QdrantVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  apiKey: process.env.QDRANT_API_KEY, // Optional for local
  collectionName: 'my-collection',
  vectorSize: 1536, // Optional: auto-detected from embeddings
});

await vectorStore.initialize();

Advanced Features

Custom Distance Metrics

// Qdrant supports multiple distance metrics
// Configured during collection creation
const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  collectionName: 'my-collection',
  // Distance is set to Cosine by default
});

Advanced Filtering

// Complex metadata filtering
const results = await vectorStore.search('query', {
  topK: 10,
  filter: {
    category: 'technical',
    date: { $gte: '2024-01-01' },
    tags: { $in: ['typescript', 'framework'] },
  },
});

Batch Operations

// Efficient batch indexing
const documents = Array.from({ length: 10000 }, (_, i) => ({
  content: `Document ${i}`,
  metadata: { index: i },
}));

await vectorStore.addDocuments(documents);
// Automatically batched for optimal performance

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐⭐ (Very Fast, Rust-based)
  • Search Speed: ⭐⭐⭐⭐⭐ (Extremely Fast)
  • Scalability: ⭐⭐⭐⭐⭐ (Horizontal scaling)
  • Persistence: ✅ (Configurable)

Production Deployment

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: qdrant
spec:
  replicas: 3
  selector:
    matchLabels:
      app: qdrant
  template:
    metadata:
      labels:
        app: qdrant
    spec:
      containers:
      - name: qdrant
        image: qdrant/qdrant:latest
        ports:
        - containerPort: 6333
        volumeMounts:
        - name: storage
          mountPath: /qdrant/storage
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: qdrant-pvc
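
To make the deployment reachable inside the cluster, pair it with a Service (a minimal sketch; the selector matches the labels above):

apiVersion: v1
kind: Service
metadata:
  name: qdrant
spec:
  selector:
    app: qdrant
  ports:
  - port: 6333
    targetPort: 6333

Note that running multiple replicas against a single PersistentVolumeClaim will not work with most storage classes; a real multi-node Qdrant cluster is usually deployed as a StatefulSet with one volume per replica and Qdrant's distributed mode enabled.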

Weaviate Vector Store

Overview

Open-source vector database with GraphQL API and advanced semantic search features.

When to Use

Good For:

  • GraphQL-first applications
  • Complex semantic queries
  • Hybrid search requirements
  • Flexible schema needs
  • Multi-modal search
  • Knowledge graphs

Not Good For:

  • Simple use cases
  • Teams unfamiliar with GraphQL
  • Minimal setup requirements

Setup

Step 1: Install Client

npm install weaviate-ts-client

Step 2: Start Weaviate Server

Using Docker:

docker run -p 8080:8080 semitechnologies/weaviate:latest

Using Docker Compose:

version: '3.8'
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
    volumes:
      - ./weaviate_data:/var/lib/weaviate

Step 3: Configure in Code

import { WeaviateVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new WeaviateVectorStore(embeddings, {
  scheme: 'http',
  host: 'localhost:8080',
  apiKey: process.env.WEAVIATE_API_KEY, // Optional for local
  className: 'MyDocuments', // Weaviate class name
  textKey: 'content',
  metadataKeys: ['category', 'author', 'date'],
});

await vectorStore.initialize();

Advanced Features

GraphQL Queries

Weaviate uses GraphQL for querying, which the HazelJS adapter handles automatically.
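
For reference, a Weaviate vector search in raw GraphQL looks roughly like this (illustrative only; the class name and fields match the configuration above):

{
  Get {
    MyDocuments(
      nearVector: { vector: [/* query embedding */] }
      limit: 5
    ) {
      content
      _additional { certainty }
    }
  }
}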

Hybrid Search

// Weaviate has built-in hybrid search
// Combines vector and keyword search natively
const results = await vectorStore.search('query', {
  topK: 10,
  // Weaviate automatically uses hybrid search
});

Multi-Modal Search

// Weaviate supports images, text, and more
// (Requires additional Weaviate configuration)

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐ (Fast)
  • Search Speed: ⭐⭐⭐⭐ (Fast)
  • Scalability: ⭐⭐⭐⭐ (Good horizontal scaling)
  • Persistence: ✅ (Configurable)

ChromaDB Vector Store

Overview

Lightweight, embedded vector database perfect for local development and prototyping.

When to Use

Good For:

  • Local development
  • Prototyping
  • Small to medium datasets
  • Simple deployments
  • Learning and experimentation

Not Good For:

  • Large-scale production
  • High-concurrency applications
  • Distributed systems

Setup

Step 1: Install Client

npm install chromadb

Step 2: Start ChromaDB Server

Using Docker:

docker run -p 8000:8000 chromadb/chroma

Using Python (Alternative):

pip install chromadb
chroma run --host 0.0.0.0 --port 8000

Step 3: Configure in Code

import { ChromaVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new ChromaVectorStore(embeddings, {
  url: 'http://localhost:8000',
  collectionName: 'my-collection',
  auth: { // Optional
    provider: 'token',
    credentials: process.env.CHROMA_TOKEN,
  },
});

await vectorStore.initialize();

Advanced Features

Collection Statistics

// ChromaDB-specific features
const stats = await vectorStore.getStats();
console.log(`Collection has ${stats.count} documents`);

Peek Documents

// Preview first N documents
const preview = await vectorStore.peek(10);
console.log('First 10 documents:', preview);

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐ (Fast)
  • Search Speed: ⭐⭐⭐⭐ (Fast for small datasets)
  • Scalability: ⭐⭐⭐ (Limited for large datasets)
  • Persistence: ✅ (File-based)

Common Operations

All vector stores implement the same interface:
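
In rough TypeScript terms, that shared surface looks like this (a sketch inferred from the operations below, not the literal HazelJS type declarations):

interface Document {
  content: string;
  metadata?: Record<string, unknown>;
}

interface SearchResult extends Document {
  score: number;
}

interface VectorStore {
  initialize(): Promise<void>;
  addDocuments(docs: Document[]): Promise<string[]>;
  search(
    query: string,
    options?: { topK?: number; minScore?: number; filter?: Record<string, unknown> },
  ): Promise<SearchResult[]>;
  getDocument(id: string): Promise<Document | null>;
  updateDocument(id: string, update: Partial<Document>): Promise<void>;
  deleteDocuments(ids: string[]): Promise<void>;
  clear(): Promise<void>;
}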

Initialize

await vectorStore.initialize();

Add Documents

const ids = await vectorStore.addDocuments([
  {
    content: 'Document text',
    metadata: { category: 'tech', date: '2024-01-01' },
  },
  {
    content: 'Another document',
    metadata: { category: 'business' },
  },
]);

Search

const results = await vectorStore.search('search query', {
  topK: 5,
  minScore: 0.7,
  filter: { category: 'tech' },
});

results.forEach(result => {
  console.log(`Score: ${result.score}`);
  console.log(`Content: ${result.content}`);
  console.log(`Metadata:`, result.metadata);
});

Get Document

const document = await vectorStore.getDocument(documentId);
if (document) {
  console.log(document.content);
  console.log(document.metadata);
}

Update Document

await vectorStore.updateDocument(documentId, {
  content: 'Updated content',
  metadata: { updated: true },
});

Delete Documents

await vectorStore.deleteDocuments([id1, id2, id3]);

Clear All

await vectorStore.clear();

Performance Optimization

Batch Operations

// Bad: Individual operations
for (const doc of documents) {
  await vectorStore.addDocuments([doc]); // Slow!
}

// Good: Batch operation
await vectorStore.addDocuments(documents); // Fast!

Connection Pooling

// For self-hosted databases, use connection pooling
const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  // Connection pooling is handled internally
});

Caching

// Cache embeddings to avoid regeneration
const embeddingCache = new Map();

async function getEmbedding(text: string) {
  if (embeddingCache.has(text)) {
    return embeddingCache.get(text);
  }
  
  const embedding = await embeddings.embed(text);
  embeddingCache.set(text, embedding);
  return embedding;
}
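
One caveat: this cache grows without bound. In a long-running process, cap it with an eviction policy (for example, an LRU keyed on the input text).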

Monitoring and Debugging

Enable Logging

// Most vector stores support debug logging
process.env.DEBUG = 'qdrant:*';

Track Performance

async function searchWithMetrics(query: string) {
  const start = Date.now();
  
  try {
    const results = await vectorStore.search(query);
    const duration = Date.now() - start;
    
    console.log(`Search completed in ${duration}ms`);
    console.log(`Found ${results.length} results`);
    
    return results;
  } catch (error) {
    console.error('Search failed:', error);
    throw error;
  }
}

Health Checks

async function checkVectorStoreHealth() {
  try {
    await vectorStore.initialize();
    console.log('✅ Vector store is healthy');
    return true;
  } catch (error) {
    console.error('❌ Vector store is unhealthy:', error);
    return false;
  }
}

Migration Between Vector Stores

Export from Memory Store

// Export all documents
const allDocs = await memoryStore.getAllDocuments();

// Save to file
fs.writeFileSync('backup.json', JSON.stringify(allDocs));

Import to Production Store

// Load from file
const docs = JSON.parse(fs.readFileSync('backup.json', 'utf-8'));

// Import to Pinecone
await pineconeStore.addDocuments(docs);

Gradual Migration

// Dual-write during migration
async function addDocument(doc: Document) {
  await Promise.all([
    oldStore.addDocuments([doc]),
    newStore.addDocuments([doc]),
  ]);
}

// Read from new, fallback to old
async function search(query: string) {
  try {
    return await newStore.search(query);
  } catch (error) {
    console.warn('New store failed, using old store');
    return await oldStore.search(query);
  }
}

Troubleshooting

Connection Issues

// Test connection
try {
  await vectorStore.initialize();
  console.log('✅ Connected');
} catch (error) {
  console.error('❌ Connection failed:', error);
  // Check: Is the server running?
  // Check: Are credentials correct?
  // Check: Is the network accessible?
}

Dimension Mismatch

// Error: Vector dimension mismatch
// Solution: Ensure embedding dimensions match index configuration

// OpenAI text-embedding-3-small = 1536 dimensions
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'text-embedding-3-small',
  dimensions: 1536, // Must match index
});

Slow Search Performance

  • Check index size: Large indices need production stores
  • Optimize topK: Request fewer results
  • Use metadata filtering: Narrow search scope
  • Enable caching: Cache frequent queries
  • Upgrade hardware: More RAM/CPU for self-hosted
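
Several of these tips combined into one sketch (the cache follows the same in-memory pattern shown under Caching above):

const queryCache = new Map();

async function fastSearch(query: string) {
  // Cache frequent queries
  if (queryCache.has(query)) return queryCache.get(query);

  const results = await vectorStore.search(query, {
    topK: 5,                           // request fewer results
    filter: { category: 'technical' }, // narrow the scope with metadata
  });

  queryCache.set(query, results);
  return results;
}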

Best Practices

Start Simple, Scale Later

// Development
const devStore = new MemoryVectorStore(embeddings);

// Production
const prodStore = process.env.NODE_ENV === 'production'
  ? new PineconeVectorStore(embeddings, config)
  : new MemoryVectorStore(embeddings);

Use Environment Variables

const vectorStore = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: process.env.PINECONE_ENVIRONMENT,
  indexName: process.env.PINECONE_INDEX,
});

Implement Retry Logic

async function searchWithRetry(query: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await vectorStore.search(query);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1)));
    }
  }
}

Monitor Costs

// Track API usage
let embeddingCalls = 0;
let searchCalls = 0;

const wrappedEmbeddings = {
  async embed(text: string) {
    embeddingCalls++;
    return embeddings.embed(text);
  },
};

// Route searches through a wrapper so searchCalls is actually incremented
async function trackedSearch(query: string) {
  searchCalls++;
  return vectorStore.search(query);
}

// Log periodically
setInterval(() => {
  console.log(`Embeddings: ${embeddingCalls}, Searches: ${searchCalls}`);
}, 60000);

What's Next?

  • Learn about RAG Package for complete RAG implementation
  • Explore AI Package for LLM integration
  • Check out Caching for performance optimization