Vector Stores Guide

A comprehensive guide to choosing, configuring, and using vector stores in HazelJS RAG applications.

Overview

Vector stores are databases optimized for storing and searching high-dimensional vectors (embeddings). They enable semantic search by finding documents similar to a query based on vector similarity rather than keyword matching.
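
"Similarity" here is usually a geometric measure such as cosine similarity between embedding vectors. As a plain TypeScript illustration (not part of the HazelJS API):

// Cosine similarity between two equal-length vectors:
// 1 = same direction (very similar), 0 = orthogonal (unrelated)
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}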

How Vector Stores Work

The Process

  1. Indexing: Documents are converted to vectors using an embedding model
  2. Storage: The vectors are stored in the vector database together with their metadata
  3. Querying: Search queries are converted to vectors using the same embedding model
  4. Similarity Search: The database finds the stored vectors closest to the query vector
  5. Results: The matching documents are returned, ranked by similarity score (see the sketch below)
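
End to end, the process maps onto the store API like this (a minimal sketch using the memory store and the shared operations covered under Common Operations below):

import { MemoryVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({ apiKey: process.env.OPENAI_API_KEY });
const vectorStore = new MemoryVectorStore(embeddings);
await vectorStore.initialize();

// Steps 1-2: documents are embedded and stored with their metadata
await vectorStore.addDocuments([
  { content: 'HazelJS supports multiple vector stores', metadata: { topic: 'rag' } },
]);

// Steps 3-4: the query is embedded and matched against the stored vectors
const results = await vectorStore.search('Which vector stores does HazelJS support?', { topK: 3 });

// Step 5: results arrive ranked by similarity score
results.forEach(r => console.log(r.score, r.content));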

Choosing a Vector Store

Quick Recommendations

| Use Case | Recommended Store | Why |
| --- | --- | --- |
| Local Development | Memory or ChromaDB | No setup, fast iteration |
| Prototyping | ChromaDB | Easy setup, persistent |
| Production (Serverless) | Pinecone | Fully managed, auto-scaling |
| Production (High-Performance) | Qdrant | Rust-based, extremely fast |
| Production (GraphQL) | Weaviate | GraphQL API, flexible |
| Cost-Sensitive Production | Qdrant or Weaviate | Open-source, self-hosted |
| Small Datasets (under 10K docs) | Any | All perform well |
| Large Datasets (over 1M docs) | Pinecone, Qdrant, Weaviate | Built for scale |

Memory Vector Store

Overview

In-memory vector storage with no external dependencies. Perfect for development and testing.

When to Use

Good For:

  • Local development
  • Testing and CI/CD
  • Prototyping
  • Small datasets (under 10,000 documents)
  • Learning RAG concepts

Not Good For:

  • Production applications
  • Large datasets
  • Multi-process applications
  • Persistent storage needs

Setup

import { MemoryVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new MemoryVectorStore(embeddings);
await vectorStore.initialize();

Configuration

No configuration needed! It works out of the box.
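
A quick smoke test, using the shared operations described under Common Operations below:

await vectorStore.addDocuments([
  { content: 'Hello, vector world', metadata: { source: 'test' } },
]);

const results = await vectorStore.search('a greeting', { topK: 1 });
console.log(results[0]?.content); // 'Hello, vector world'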

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐⭐ (Very Fast)
  • Search Speed: ⭐⭐⭐⭐⭐ (Very Fast for under 10K docs)
  • Scalability: ⭐ (Limited to memory)
  • Persistence: ❌ (Data lost on restart)

Pinecone Vector Store

Overview

Fully managed, serverless vector database with automatic scaling and global distribution.

When to Use

Good For:

  • Production applications
  • Serverless deployments
  • Global applications
  • Teams without DevOps
  • Auto-scaling requirements
  • Multi-tenancy (namespaces)

Not Good For:

  • Budget-constrained projects
  • On-premise requirements
  • Air-gapped environments

Setup

Step 1: Create Pinecone Account

Sign up at pinecone.io. (Index creation is covered in Step 3.)

Step 2: Install Client

npm install @pinecone-database/pinecone

Step 3: Create Index

In Pinecone dashboard:

  • Index Name: my-knowledge-base
  • Dimensions: 1536 (for OpenAI text-embedding-3-small)
  • Metric: cosine
  • Environment: Select your region
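
Alternatively, the index can be created programmatically. A sketch assuming a recent version of the @pinecone-database/pinecone SDK (check the Pinecone docs for the exact spec options available on your plan):

import { Pinecone } from '@pinecone-database/pinecone';

const pc = new Pinecone({ apiKey: process.env.PINECONE_API_KEY! });

// Dimensions and metric must match your embedding model (see Step 4)
await pc.createIndex({
  name: 'my-knowledge-base',
  dimension: 1536,
  metric: 'cosine',
  spec: { serverless: { cloud: 'aws', region: 'us-east-1' } },
});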

Step 4: Configure in Code

import { PineconeVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'text-embedding-3-small',
  dimensions: 1536,
});

const vectorStore = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws', // Your Pinecone environment
  indexName: 'my-knowledge-base',
  namespace: 'production', // Optional: for multi-tenancy
  textKey: 'content', // Optional: custom field name
  metadataKey: 'metadata', // Optional: custom metadata field
});

await vectorStore.initialize();

Advanced Features

Namespaces (Multi-Tenancy)

// Separate data by tenant
const tenant1Store = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws',
  indexName: 'shared-index',
  namespace: 'tenant-1',
});

const tenant2Store = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: 'us-east-1-aws',
  indexName: 'shared-index',
  namespace: 'tenant-2',
});

Metadata Filtering

// Add documents with rich metadata
await vectorStore.addDocuments([
  {
    content: 'Document content',
    metadata: {
      category: 'technical',
      date: '2024-01-01',
      author: 'John Doe',
    },
  },
]);

// Filter during search
const results = await vectorStore.search('query', {
  topK: 5,
  filter: {
    category: 'technical',
    author: 'John Doe',
  },
});

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐ (Fast, network dependent)
  • Search Speed: ⭐⭐⭐⭐⭐ (Very Fast, under 100ms)
  • Scalability: ⭐⭐⭐⭐⭐ (Auto-scaling)
  • Persistence: ✅ (Fully managed)

Pricing

  • Free Tier: 1 index, 100K vectors
  • Starter: $70/month for 5M vectors
  • Enterprise: Custom pricing

Qdrant Vector Store

Overview

High-performance, Rust-based vector database optimized for speed and efficiency.

When to Use

Good For:

  • High-performance requirements
  • Self-hosted deployments
  • Cost-sensitive production
  • Advanced filtering needs
  • On-premise deployments
  • Large-scale applications

Not Good For:

  • Teams without DevOps
  • Serverless-only deployments
  • Quick prototypes

Setup

Step 1: Install Client

npm install @qdrant/js-client-rest

Step 2: Start Qdrant Server

Using Docker:

docker run -p 6333:6333 qdrant/qdrant

Using Docker Compose:

version: '3.8'
services:
  qdrant:
    image: qdrant/qdrant:latest
    ports:
      - "6333:6333"
      - "6334:6334"
    volumes:
      - ./qdrant_storage:/qdrant/storage

Production Deployment:

# With persistence
docker run -p 6333:6333 \
  -v $(pwd)/qdrant_storage:/qdrant/storage \
  qdrant/qdrant

Step 3: Configure in Code

import { QdrantVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  apiKey: process.env.QDRANT_API_KEY, // Optional for local
  collectionName: 'my-collection',
  vectorSize: 1536, // Optional: auto-detected from embeddings
});

await vectorStore.initialize();

Advanced Features

Custom Distance Metrics

// Qdrant supports multiple distance metrics
// Configured during collection creation
const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  collectionName: 'my-collection',
  // Distance is set to Cosine by default
});

Advanced Filtering

// Complex metadata filtering
const results = await vectorStore.search('query', {
  topK: 10,
  filter: {
    category: 'technical',
    date: { $gte: '2024-01-01' },
    tags: { $in: ['typescript', 'framework'] },
  },
});

Batch Operations

// Efficient batch indexing
const documents = Array.from({ length: 10000 }, (_, i) => ({
  content: `Document ${i}`,
  metadata: { index: i },
}));

await vectorStore.addDocuments(documents);
// Automatically batched for optimal performance

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐⭐ (Very Fast, Rust-based)
  • Search Speed: ⭐⭐⭐⭐⭐ (Extremely Fast)
  • Scalability: ⭐⭐⭐⭐⭐ (Horizontal scaling)
  • Persistence: ✅ (Configurable)

Production Deployment

Kubernetes

apiVersion: apps/v1
kind: Deployment
metadata:
  name: qdrant
spec:
  replicas: 3
  selector:
    matchLabels:
      app: qdrant
  template:
    metadata:
      labels:
        app: qdrant
    spec:
      containers:
      - name: qdrant
        image: qdrant/qdrant:latest
        ports:
        - containerPort: 6333
        volumeMounts:
        - name: storage
          mountPath: /qdrant/storage
      volumes:
      - name: storage
        persistentVolumeClaim:
          claimName: qdrant-pvc
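
To make the deployment reachable inside the cluster, pair it with a Service (a minimal sketch; the selector matches the labels above):

apiVersion: v1
kind: Service
metadata:
  name: qdrant
spec:
  selector:
    app: qdrant
  ports:
  - port: 6333
    targetPort: 6333

Note that running multiple replicas against a single PersistentVolumeClaim will not work with most storage classes; a real multi-node Qdrant cluster is usually deployed as a StatefulSet with one volume per replica and Qdrant's distributed mode enabled.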

Weaviate Vector Store

Overview

Open-source vector database with GraphQL API and advanced semantic search features.

When to Use

Good For:

  • GraphQL-first applications
  • Complex semantic queries
  • Hybrid search requirements
  • Flexible schema needs
  • Multi-modal search
  • Knowledge graphs

Not Good For:

  • Simple use cases
  • Teams unfamiliar with GraphQL
  • Minimal setup requirements

Setup

Step 1: Install Client

npm install weaviate-ts-client

Step 2: Start Weaviate Server

Using Docker:

docker run -p 8080:8080 semitechnologies/weaviate:latest

Using Docker Compose:

version: '3.8'
services:
  weaviate:
    image: semitechnologies/weaviate:latest
    ports:
      - "8080:8080"
    environment:
      QUERY_DEFAULTS_LIMIT: 25
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: 'true'
      PERSISTENCE_DATA_PATH: '/var/lib/weaviate'
    volumes:
      - ./weaviate_data:/var/lib/weaviate

Step 3: Configure in Code

import { WeaviateVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new WeaviateVectorStore(embeddings, {
  scheme: 'http',
  host: 'localhost:8080',
  apiKey: process.env.WEAVIATE_API_KEY, // Optional for local
  className: 'MyDocuments', // Weaviate class name
  textKey: 'content',
  metadataKeys: ['category', 'author', 'date'],
});

await vectorStore.initialize();

Advanced Features

GraphQL Queries

Weaviate uses GraphQL for querying, which the HazelJS adapter handles automatically.
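
For reference, a Weaviate vector search in raw GraphQL looks roughly like this (illustrative only; the class name and fields match the configuration above):

{
  Get {
    MyDocuments(
      nearVector: { vector: [/* query embedding */] }
      limit: 5
    ) {
      content
      _additional { certainty }
    }
  }
}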

Hybrid Search

// Weaviate has built-in hybrid search
// Combines vector and keyword search natively
const results = await vectorStore.search('query', {
  topK: 10,
  // Weaviate automatically uses hybrid search
});

Multi-Modal Search

// Weaviate supports images, text, and more
// (Requires additional Weaviate configuration)

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐ (Fast)
  • Search Speed: ⭐⭐⭐⭐ (Fast)
  • Scalability: ⭐⭐⭐⭐ (Good horizontal scaling)
  • Persistence: ✅ (Configurable)

ChromaDB Vector Store

Overview

Lightweight, embedded vector database perfect for local development and prototyping.

When to Use

Good For:

  • Local development
  • Prototyping
  • Small to medium datasets
  • Simple deployments
  • Learning and experimentation

Not Good For:

  • Large-scale production
  • High-concurrency applications
  • Distributed systems

Setup

Step 1: Install Client

npm install chromadb

Step 2: Start ChromaDB Server

Using Docker:

docker run -p 8000:8000 chromadb/chroma

Using Python (Alternative):

pip install chromadb
chroma run --host 0.0.0.0 --port 8000

Step 3: Configure in Code

import { ChromaVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';

const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
});

const vectorStore = new ChromaVectorStore(embeddings, {
  url: 'http://localhost:8000',
  collectionName: 'my-collection',
  auth: { // Optional
    provider: 'token',
    credentials: process.env.CHROMA_TOKEN,
  },
});

await vectorStore.initialize();

Advanced Features

Collection Statistics

// ChromaDB-specific features
const stats = await vectorStore.getStats();
console.log(`Collection has ${stats.count} documents`);

Peek Documents

// Preview first N documents
const preview = await vectorStore.peek(10);
console.log('First 10 documents:', preview);

Performance Characteristics

  • Indexing Speed: ⭐⭐⭐⭐ (Fast)
  • Search Speed: ⭐⭐⭐⭐ (Fast for small datasets)
  • Scalability: ⭐⭐⭐ (Limited for large datasets)
  • Persistence: ✅ (File-based)

Common Operations

All vector stores implement the same interface:
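
In rough TypeScript terms, that shared surface looks like this (a sketch inferred from the operations below, not the literal HazelJS type declarations):

interface Document {
  content: string;
  metadata?: Record<string, unknown>;
}

interface SearchResult extends Document {
  score: number;
}

interface VectorStore {
  initialize(): Promise<void>;
  addDocuments(docs: Document[]): Promise<string[]>;
  search(
    query: string,
    options?: { topK?: number; minScore?: number; filter?: Record<string, unknown> },
  ): Promise<SearchResult[]>;
  getDocument(id: string): Promise<Document | null>;
  updateDocument(id: string, update: Partial<Document>): Promise<void>;
  deleteDocuments(ids: string[]): Promise<void>;
  clear(): Promise<void>;
}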

Initialize

await vectorStore.initialize();

Add Documents

const ids = await vectorStore.addDocuments([
  {
    content: 'Document text',
    metadata: { category: 'tech', date: '2024-01-01' },
  },
  {
    content: 'Another document',
    metadata: { category: 'business' },
  },
]);

Search

const results = await vectorStore.search('search query', {
  topK: 5,
  minScore: 0.7,
  filter: { category: 'tech' },
});

results.forEach(result => {
  console.log(`Score: ${result.score}`);
  console.log(`Content: ${result.content}`);
  console.log(`Metadata:`, result.metadata);
});

Get Document

const document = await vectorStore.getDocument(documentId);
if (document) {
  console.log(document.content);
  console.log(document.metadata);
}

Update Document

await vectorStore.updateDocument(documentId, {
  content: 'Updated content',
  metadata: { updated: true },
});

Delete Documents

await vectorStore.deleteDocuments([id1, id2, id3]);

Clear All

await vectorStore.clear();

Performance Optimization

Batch Operations

// Bad: Individual operations
for (const doc of documents) {
  await vectorStore.addDocuments([doc]); // Slow!
}

// Good: Batch operation
await vectorStore.addDocuments(documents); // Fast!

Connection Pooling

// For self-hosted databases, use connection pooling
const vectorStore = new QdrantVectorStore(embeddings, {
  url: 'http://localhost:6333',
  // Connection pooling is handled internally
});

Caching

// Cache embeddings to avoid regeneration
const embeddingCache = new Map();

async function getEmbedding(text: string) {
  if (embeddingCache.has(text)) {
    return embeddingCache.get(text);
  }
  
  const embedding = await embeddings.embed(text);
  embeddingCache.set(text, embedding);
  return embedding;
}
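
One caveat: this cache grows without bound. In a long-running process, cap it with an eviction policy (for example, an LRU keyed on the input text).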

Monitoring and Debugging

Enable Logging

// Most vector stores support debug logging
process.env.DEBUG = 'qdrant:*';

Track Performance

async function searchWithMetrics(query: string) {
  const start = Date.now();
  
  try {
    const results = await vectorStore.search(query);
    const duration = Date.now() - start;
    
    console.log(`Search completed in ${duration}ms`);
    console.log(`Found ${results.length} results`);
    
    return results;
  } catch (error) {
    console.error('Search failed:', error);
    throw error;
  }
}

Health Checks

async function checkVectorStoreHealth() {
  try {
    await vectorStore.initialize();
    console.log('✅ Vector store is healthy');
    return true;
  } catch (error) {
    console.error('❌ Vector store is unhealthy:', error);
    return false;
  }
}

Migration Between Vector Stores

Export from Memory Store

// Export all documents
const allDocs = await memoryStore.getAllDocuments();

// Save to file
fs.writeFileSync('backup.json', JSON.stringify(allDocs));

Import to Production Store

// Load from file
const docs = JSON.parse(fs.readFileSync('backup.json', 'utf-8'));

// Import to Pinecone
await pineconeStore.addDocuments(docs);

Gradual Migration

// Dual-write during migration
async function addDocument(doc: Document) {
  await Promise.all([
    oldStore.addDocuments([doc]),
    newStore.addDocuments([doc]),
  ]);
}

// Read from new, fallback to old
async function search(query: string) {
  try {
    return await newStore.search(query);
  } catch (error) {
    console.warn('New store failed, using old store');
    return await oldStore.search(query);
  }
}

Troubleshooting

Connection Issues

// Test connection
try {
  await vectorStore.initialize();
  console.log('✅ Connected');
} catch (error) {
  console.error('❌ Connection failed:', error);
  // Check: Is the server running?
  // Check: Are credentials correct?
  // Check: Is the network accessible?
}

Dimension Mismatch

// Error: Vector dimension mismatch
// Solution: Ensure embedding dimensions match index configuration

// OpenAI text-embedding-3-small = 1536 dimensions
const embeddings = new OpenAIEmbeddings({
  apiKey: process.env.OPENAI_API_KEY,
  model: 'text-embedding-3-small',
  dimensions: 1536, // Must match index
});

Slow Search Performance

  • Check index size: Large indices need production stores
  • Optimize topK: Request fewer results
  • Use metadata filtering: Narrow search scope
  • Enable caching: Cache frequent queries
  • Upgrade hardware: More RAM/CPU for self-hosted
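
Several of these tips combined into one sketch (the cache follows the same in-memory pattern shown under Caching above):

const queryCache = new Map();

async function fastSearch(query: string) {
  // Cache frequent queries
  if (queryCache.has(query)) return queryCache.get(query);

  const results = await vectorStore.search(query, {
    topK: 5,                           // request fewer results
    filter: { category: 'technical' }, // narrow the scope with metadata
  });

  queryCache.set(query, results);
  return results;
}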

Best Practices

Start Simple, Scale Later

// Development
const devStore = new MemoryVectorStore(embeddings);

// Production
const prodStore = process.env.NODE_ENV === 'production'
  ? new PineconeVectorStore(embeddings, config)
  : new MemoryVectorStore(embeddings);

Use Environment Variables

const vectorStore = new PineconeVectorStore(embeddings, {
  apiKey: process.env.PINECONE_API_KEY,
  environment: process.env.PINECONE_ENVIRONMENT,
  indexName: process.env.PINECONE_INDEX,
});

Implement Retry Logic

async function searchWithRetry(query: string, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await vectorStore.search(query);
    } catch (error) {
      if (i === maxRetries - 1) throw error;
      await new Promise(resolve => setTimeout(resolve, 1000 * (i + 1)));
    }
  }
}

Monitor Costs

// Track API usage
let embeddingCalls = 0;
let searchCalls = 0;

const wrappedEmbeddings = {
  async embed(text: string) {
    embeddingCalls++;
    return embeddings.embed(text);
  },
};

// Route searches through a wrapper so searchCalls is actually incremented
async function trackedSearch(query: string) {
  searchCalls++;
  return vectorStore.search(query);
}

// Log periodically
setInterval(() => {
  console.log(`Embeddings: ${embeddingCalls}, Searches: ${searchCalls}`);
}, 60000);

What's Next?

  • Learn about RAG Package for complete RAG implementation
  • Explore AI Package for LLM integration
  • Check out Caching for performance optimization