Memory System
The Memory System in @hazeljs/rag provides persistent context and conversation management, enabling AI applications to remember conversations, user preferences, and historical interactions across sessions.
Overview
Building context-aware AI applications requires managing conversation history, tracking entities, storing facts, and maintaining temporal context. The Memory System solves these challenges by providing:
- 5 Memory Types: Conversation, Entity, Fact, Event, and Working memory
- 3 Storage Strategies: BufferMemory (fast), VectorMemory (semantic), HybridMemory (best of both)
- Semantic Search: Find relevant memories using embeddings
- Auto-Summarization: Compress old conversations automatically
- Entity Tracking: Remember people, companies, and relationships
- Importance Scoring: Prioritize relevant information
- RAG Integration: Combine document retrieval with conversation context
Memory Types
Conversation Memory
Track multi-turn conversations with automatic summarization.
import { MemoryManager, BufferMemory } from '@hazeljs/rag';
const memoryStore = new BufferMemory({ maxSize: 100 });
const memoryManager = new MemoryManager(memoryStore, {
maxConversationLength: 20,
summarizeAfter: 50,
});
await memoryManager.initialize();
// Add messages
await memoryManager.addMessage(
{ role: 'user', content: 'What is HazelJS?' },
'session-123'
);
await memoryManager.addMessage(
{ role: 'assistant', content: 'HazelJS is an AI-native framework...' },
'session-123'
);
// Get history
const history = await memoryManager.getConversationHistory('session-123', 10);
// Summarize
const summary = await memoryManager.summarizeConversation('session-123');
Features:
- Sliding window for recent messages
- Automatic summarization of old conversations
- Token-aware context management
- Multi-session support
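The sliding-window and token-aware trimming described above can be sketched independently of the library. This is a minimal illustration, not the framework's implementation; the 4-characters-per-token estimate is a rough stand-in for a real tokenizer:

```typescript
type Message = { role: 'user' | 'assistant'; content: string };

// Keep the most recent messages whose combined (rough) token count fits a budget.
function slidingWindow(history: Message[], maxTokens: number): Message[] {
  const estimate = (m: Message) => Math.ceil(m.content.length / 4); // ~4 chars/token
  const kept: Message[] = [];
  let used = 0;
  for (let i = history.length - 1; i >= 0; i--) {
    const cost = estimate(history[i]);
    if (used + cost > maxTokens) break; // oldest messages fall out of the window
    kept.unshift(history[i]);
    used += cost;
  }
  return kept;
}
```

Older messages that fall outside the window are what `summarizeAfter` would compress into a summary.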
Entity Memory
Track entities (people, companies, concepts) mentioned in conversations.
// Track an entity
await memoryManager.trackEntity({
name: 'Alice',
type: 'person',
attributes: {
role: 'engineer',
company: 'TechCorp',
},
relationships: [
{ type: 'works_at', target: 'TechCorp' },
],
firstSeen: new Date(),
lastSeen: new Date(),
mentions: 1,
});
// Retrieve entity (returns null if not found)
const alice = await memoryManager.getEntity('Alice');
// Update entity
if (alice) {
  await memoryManager.updateEntity('Alice', {
    attributes: { ...alice.attributes, status: 'premium' },
  });
}
// Get all entities
const entities = await memoryManager.getAllEntities('session-123');
Use Cases:
- Customer relationship management
- Personalized recommendations
- Knowledge graph construction
- Context-aware responses
Semantic Memory (Facts)
Store and recall facts with semantic understanding.
// Store facts (storeFact returns the new fact's id)
const factId = await memoryManager.storeFact(
'User prefers dark mode',
{ userId: 'user-123', category: 'preference' }
);
await memoryManager.storeFact(
'HazelJS supports TypeScript decorators',
{ category: 'framework-feature' }
);
// Recall facts semantically
const facts = await memoryManager.recallFacts('user preferences', {
topK: 5,
minScore: 0.7,
});
// Update a fact
await memoryManager.updateFact(factId, 'User prefers light mode');
Features:
- Semantic search across facts
- Time-based relevance
- Conflict detection
- Automatic consolidation
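Semantic recall ranks stored facts by the similarity of their embeddings to the query's embedding. The toy sketch below uses hand-made 3-d vectors in place of real embeddings, and `recall`/`cosine` are illustrative names, not the package API:

```typescript
// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

// Rank facts by similarity to the query vector, dropping weak matches
// (this mirrors the role of minScore in recallFacts).
function recall(
  queryVec: number[],
  facts: { text: string; vec: number[] }[],
  minScore: number,
) {
  return facts
    .map((f) => ({ text: f.text, score: cosine(queryVec, f.vec) }))
    .filter((f) => f.score >= minScore)
    .sort((a, b) => b.score - a.score);
}
```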
Working Memory
Temporary scratchpad for current task context.
// Set context
await memoryManager.setContext('current_task', 'checkout', 'session-123');
await memoryManager.setContext('cart_items', ['item1', 'item2'], 'session-123');
// Get context
const task = await memoryManager.getContext('current_task', 'session-123');
const items = await memoryManager.getContext('cart_items', 'session-123');
// Clear context
await memoryManager.clearContext('session-123');
Use Cases:
- Multi-step workflows
- State management
- Temporary calculations
- Task coordination
Storage Strategies
BufferMemory
Fast FIFO in-memory buffer for recent memories.
import { BufferMemory } from '@hazeljs/rag';
const buffer = new BufferMemory({
maxSize: 100,
ttl: 3600000, // 1 hour in milliseconds
});
Best For:
- Development and testing
- Recent conversation history
- Low-latency requirements
- Temporary context
Advantages:
- Extremely fast (in-memory)
- Zero setup
- No external dependencies
- Automatic TTL expiration
Limitations:
- Data lost on restart
- Limited capacity
- No semantic search
VectorMemory
Stores memories as embeddings for semantic search.
import { VectorMemory, MemoryVectorStore, OpenAIEmbeddings } from '@hazeljs/rag';
const embeddings = new OpenAIEmbeddings({
apiKey: process.env.OPENAI_API_KEY,
});
const vectorStore = new MemoryVectorStore(embeddings);
const vectorMemory = new VectorMemory(vectorStore, embeddings, {
collectionName: 'memories',
});
Best For:
- Long-term memory storage
- Semantic search requirements
- Production deployments
- Large memory volumes
Advantages:
- Semantic search
- Persistent storage
- Scalable
- Works with any vector store
Limitations:
- Slower than buffer
- Requires embeddings
- External dependencies
HybridMemory
Combines buffer and vector storage for optimal performance.
import { HybridMemory, BufferMemory, VectorMemory } from '@hazeljs/rag';
const buffer = new BufferMemory({ maxSize: 20 });
const vectorMemory = new VectorMemory(vectorStore, embeddings);
const hybrid = new HybridMemory(buffer, vectorMemory, {
archiveThreshold: 15, // Archive after 15 messages
});
Best For:
- Production applications
- Balancing speed and persistence
- Large-scale deployments
- Best of both worlds
How It Works:
- Recent memories stay in fast buffer
- Old memories automatically archive to vector store
- Searches check both stores
- Deduplication ensures consistency
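The archiving flow above can be sketched in a few lines. The class and method names here are illustrative, not the `@hazeljs/rag` API; a real archive would be a vector store rather than an array:

```typescript
// Minimal sketch of the hybrid pattern: a fast recent buffer plus a slower archive.
class HybridSketch {
  private buffer: string[] = [];
  private archive: string[] = [];
  constructor(private archiveThreshold: number) {}

  add(memory: string): void {
    this.buffer.push(memory);
    // Once the buffer grows past the threshold, move the oldest entries out.
    while (this.buffer.length > this.archiveThreshold) {
      this.archive.push(this.buffer.shift()!);
    }
  }

  // Searches consult both stores; deduplication keeps results consistent.
  search(term: string): string[] {
    const hits = [...this.buffer, ...this.archive].filter((m) => m.includes(term));
    return [...new Set(hits)];
  }
}
```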
RAG Integration
Combine memory with document retrieval for context-aware responses.
import {
RAGPipelineWithMemory,
MemoryManager,
HybridMemory,
BufferMemory,
VectorMemory,
MemoryVectorStore,
OpenAIEmbeddings,
} from '@hazeljs/rag';
// Setup memory
const embeddings = new OpenAIEmbeddings({
apiKey: process.env.OPENAI_API_KEY,
});
const buffer = new BufferMemory({ maxSize: 20 });
const memoryVectorStore = new MemoryVectorStore(embeddings);
const vectorMemory = new VectorMemory(memoryVectorStore, embeddings);
const hybridMemory = new HybridMemory(buffer, vectorMemory);
const memoryManager = new MemoryManager(hybridMemory, {
maxConversationLength: 20,
summarizeAfter: 50,
entityExtraction: true,
});
// Setup RAG
const documentVectorStore = new MemoryVectorStore(embeddings);
const rag = new RAGPipelineWithMemory(
{
vectorStore: documentVectorStore,
embeddingProvider: embeddings,
topK: 5,
},
memoryManager,
llmFunction // your LLM call, e.g. (prompt: string) => Promise<string>
);
await rag.initialize();
// Add documents
await rag.addDocuments([
{
content: 'HazelJS is a modern TypeScript framework...',
metadata: { source: 'docs' },
},
]);
// Query with memory context
const response = await rag.queryWithMemory(
'What did we discuss about pricing?',
'session-123',
'user-456'
);
console.log(response.answer);
console.log('Sources:', response.sources);
console.log('Memories:', response.memories);
console.log('History:', response.conversationHistory);
Enhanced Context
The RAG pipeline with memory combines three sources of context:
- Document Retrieval: Relevant documents from knowledge base
- Conversation History: Recent messages in the conversation
- Relevant Memories: Semantically similar past interactions
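One way to picture how the three sources combine is a single prompt assembled from all of them. This is an illustrative sketch of the idea, not the pipeline's actual prompt template:

```typescript
// Merge retrieved documents, conversation history, and recalled memories
// into one prompt for the LLM.
function buildPrompt(
  query: string,
  documents: string[],
  history: { role: string; content: string }[],
  memories: string[],
): string {
  const docBlock = documents.map((d, i) => `[doc ${i + 1}] ${d}`).join('\n');
  const historyBlock = history.map((m) => `${m.role}: ${m.content}`).join('\n');
  const memoryBlock = memories.map((m) => `- ${m}`).join('\n');
  return [
    'Relevant documents:',
    docBlock,
    'Conversation so far:',
    historyBlock,
    'Relevant memories:',
    memoryBlock,
    `Question: ${query}`,
  ].join('\n\n');
}
```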
// Automatic fact extraction
const response = await rag.queryWithLearning(
'Tell me about HazelJS features',
'session-123',
'user-456'
);
// Facts from response are automatically stored
// Get conversation summary
const summary = await rag.getConversationSummary('session-123');
// Recall specific facts
const facts = await rag.recallFacts('user preferences', 5);
// Memory statistics
const stats = await rag.getMemoryStats('session-123');
Advanced Features
Memory Search
Search across all memories semantically:
import { MemoryType } from '@hazeljs/rag';

const relevantMemories = await memoryManager.relevantMemories(
'pricing and discounts',
{
sessionId: 'session-123',
types: [MemoryType.CONVERSATION, MemoryType.FACT],
topK: 5,
minScore: 0.7,
}
);
Importance Scoring
Automatically calculate and use importance scores:
const memoryManager = new MemoryManager(memoryStore, {
importanceScoring: true, // Enable automatic scoring
});
// Memories with higher importance are retained longer
// Questions and long content get higher scores
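A heuristic in the spirit of the description above might look like the following. This is an assumed formula for illustration only, not the library's actual scoring logic:

```typescript
// Questions and longer content score higher; the result is clamped to [0, 1].
function importanceScore(content: string): number {
  let score = 0.3; // baseline importance
  if (content.includes('?')) score += 0.3; // questions carry more signal
  score += Math.min(content.length / 1000, 0.4); // longer content, up to a cap
  return Math.min(score, 1);
}
```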
Memory Decay
Time-based relevance scoring:
const memoryManager = new MemoryManager(memoryStore, {
memoryDecay: true,
decayRate: 0.1, // 10% decay per time unit
});
// Older memories gradually become less relevant
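Decay of this kind is commonly modeled as exponential falloff with age. A sketch under that assumption (the library's exact decay curve may differ):

```typescript
// Relevance falls off exponentially with age; a decayRate of 0.1 means
// roughly 10% loss per elapsed time unit.
function decayedScore(baseScore: number, ageInUnits: number, decayRate = 0.1): number {
  return baseScore * Math.pow(1 - decayRate, ageInUnits);
}
```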
Memory Statistics
Monitor memory usage:
const stats = await memoryManager.getStats('session-123');
console.log(`Total memories: ${stats.totalMemories}`);
console.log(`By type:`, stats.byType);
console.log(`Average importance: ${stats.averageImportance}`);
console.log(`Oldest: ${stats.oldestMemory}`);
console.log(`Newest: ${stats.newestMemory}`);
Memory Pruning
Clean up old or low-importance memories:
// Prune memories older than 30 days
const prunedByAge = await memoryManager.prune({
olderThan: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000),
});
// Prune low-importance memories
const prunedByImportance = await memoryManager.prune({
minImportance: 0.5,
});
Configuration
Memory Manager Config
const config = {
maxConversationLength: 20, // Max messages in buffer
summarizeAfter: 50, // Summarize after N messages
entityExtraction: true, // Auto-extract entities
importanceScoring: true, // Calculate importance scores
memoryDecay: false, // Enable time-based decay
decayRate: 0.1, // Decay rate (if enabled)
maxWorkingMemorySize: 10, // Max working memory items
};
const memoryManager = new MemoryManager(memoryStore, config);
Buffer Memory Config
const bufferConfig = {
maxSize: 100, // Max memories in buffer
ttl: 3600000, // Time to live (ms)
};
const buffer = new BufferMemory(bufferConfig);
Hybrid Memory Config
const hybridConfig = {
bufferSize: 20, // Buffer size
archiveThreshold: 15, // Archive after N messages
ttl: 3600000, // Buffer TTL
};
const hybrid = new HybridMemory(buffer, vectorMemory, hybridConfig);
Use Cases
Customer Support Bot
// Remember customer information
await memoryManager.trackEntity({
name: 'Jane Smith',
type: 'customer',
attributes: { tier: 'premium', accountId: 'ACC-123' },
// ...
});
// Store support history
await memoryManager.storeFact(
'Customer reported login issues on 2024-01-15',
{ customerId: 'ACC-123', category: 'support' }
);
// Context-aware responses
const response = await rag.queryWithMemory(
'What was my previous issue?',
'session-123',
'ACC-123'
);
Personal AI Assistant
// Remember preferences
await memoryManager.storeFact('User prefers concise responses');
await memoryManager.storeFact('User timezone is PST');
// Track tasks
await memoryManager.setContext('active_tasks', ['email', 'meeting'], 'session-123');
// Personalized responses
const response = await rag.queryWithMemory(
'What should I focus on today?',
'session-123'
);
Educational Tutor
// Track learning progress
await memoryManager.trackEntity({
name: 'Student-123',
type: 'student',
attributes: {
level: 'intermediate',
completedLessons: ['intro', 'basics'],
},
// ...
});
// Remember misconceptions
await memoryManager.storeFact(
'Student confused about async/await',
{ studentId: 'Student-123', topic: 'javascript' }
);
Best Practices
Choose the Right Store
- Development: Use BufferMemory for fast iteration
- Production: Use HybridMemory for the best balance
- Semantic Search: Use VectorMemory when search is critical
Set Appropriate Limits
- Configure maxConversationLength based on LLM token limits
- Set archiveThreshold to balance performance and memory
- Use summarizeAfter to compress long conversations
Enable Features Selectively
- entityExtraction: For tracking people and things
- importanceScoring: For prioritization
- memoryDecay: For time-based relevance
Monitor Memory Usage
// Regular monitoring
const stats = await memoryManager.getStats();
console.log(`Memory usage: ${stats.totalMemories}`);
// Periodic pruning
setInterval(async () => {
await memoryManager.prune({ olderThan: thirtyDaysAgo });
}, 24 * 60 * 60 * 1000); // Daily
Session Management
// Create one session ID per conversation and reuse it for every call in that session
const sessionId = `user-${userId}-${Date.now()}`;
// Clear sessions when done
await memoryManager.clearConversation(sessionId);
await memoryManager.clearContext(sessionId);
Examples
Check out the memory examples for complete working code:
- Basic Memory: Core features and memory types
- RAG with Memory: Integration with document retrieval
- Chatbot with Memory: Complete context-aware chatbot
API Reference
MemoryManager
class MemoryManager {
// Conversation
addMessage(message: Message, sessionId: string): Promise<string>
getConversationHistory(sessionId: string, limit?: number): Promise<Message[]>
summarizeConversation(sessionId: string): Promise<string>
clearConversation(sessionId: string): Promise<void>
// Entity
trackEntity(entity: Entity): Promise<void>
getEntity(name: string): Promise<Entity | null>
updateEntity(name: string, updates: Partial<Entity>): Promise<void>
getAllEntities(sessionId?: string): Promise<Entity[]>
// Facts
storeFact(fact: string, metadata?: Record<string, any>): Promise<string>
recallFacts(query: string, options?: MemorySearchOptions): Promise<string[]>
updateFact(id: string, newContent: string): Promise<void>
// Working Memory
setContext(key: string, value: any, sessionId: string): Promise<void>
getContext(key: string, sessionId: string): Promise<any>
clearContext(sessionId: string): Promise<void>
// Search & Stats
relevantMemories(query: string, options: MemorySearchOptions): Promise<Memory[]>
getStats(sessionId?: string): Promise<MemoryStats>
}
RAGPipelineWithMemory
class RAGPipelineWithMemory extends RAGPipeline {
queryWithMemory(
query: string,
sessionId: string,
userId?: string,
options?: RAGQueryOptions
): Promise<RAGResponseWithMemory>
queryWithLearning(
query: string,
sessionId: string,
userId?: string,
options?: RAGQueryOptions
): Promise<RAGResponseWithMemory>
clearSessionMemory(sessionId: string): Promise<void>
getConversationSummary(sessionId: string): Promise<string>
storeFact(fact: string, sessionId?: string, userId?: string): Promise<string>
recallFacts(query: string, topK?: number): Promise<string[]>
getMemoryStats(sessionId?: string): Promise<MemoryStats>
}
Next Steps
- Explore RAG Patterns for advanced retrieval strategies
- Check out Vector Stores for storage options
- See the RAG Package for complete API reference