Vector Search
Built‑in vector search with cosine similarity, Euclidean distance, and hybrid scoring. Includes performance optimizations like pre‑computed magnitudes and caching.
What is Vector Search?
Vector search enables semantic similarity search by comparing numerical representations (embeddings) of text, images, or other data. Instead of exact keyword matching, vector search finds documents that are semantically similar based on their meaning, making it ideal for AI/ML applications like RAG (Retrieval Augmented Generation), recommendation systems, and semantic search.
Why Use Vector Search?
- Semantic Understanding: Finds documents with similar meaning, not just matching keywords
- AI/ML Integration: Essential for RAG systems, chatbots, and recommendation engines
- Multi-modal Search: Works with text, images, audio, and any data that can be embedded
- Performance: Optimized with pre-computed magnitudes and caching for fast queries
- Flexible: Supports multiple distance metrics and hybrid search combining vectors with keywords
How It Works
Vector search works in four steps (sketched in code after this list):
- Embedding Generation: Convert text/data into numerical vectors (embeddings) using models like OpenAI, Cohere, or sentence-transformers
- Storage: Store embeddings alongside your documents in ArangoDB
- Similarity Calculation: Compute similarity scores between query vector and document vectors using distance metrics
- Ranking: Return documents sorted by similarity score
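Taken together, the steps look roughly like this. This is a minimal sketch: embed() stands in for whichever embedding model you wire up in Getting Started below, and step 2 (storage) is covered in Storing Documents with Embeddings.

import { getVectorSearch } from 'arango-typed';

// Step 1: convert the query text into a vector
const queryEmbedding = await embed('What is ArangoDB?');

// Steps 3–4: compare against stored document vectors and return ranked matches
const results = await getVectorSearch().similaritySearch('documents', queryEmbedding, {
  limit: 5 // top 5 most similar documents
});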
Getting Started
1. Get VectorSearch Instance
import { getVectorSearch, getDatabase, VectorSearch } from 'arango-typed';

// Get the shared instance (recommended)
const vectorSearch = getVectorSearch();

// Or create one directly
const vectorSearchDirect = new VectorSearch(getDatabase());
2. Generate Embeddings
You'll need an embedding model to convert text into vectors. Popular options:
// Using OpenAI
import OpenAI from 'openai';

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

async function embed(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });
  return response.data[0].embedding;
}

// Using sentence-transformers (Node.js)
// npm install @xenova/transformers
import { pipeline } from '@xenova/transformers';

const embedder = await pipeline('feature-extraction', 'Xenova/all-MiniLM-L6-v2');

async function embed(text: string): Promise<number[]> {
  const output = await embedder(text, { pooling: 'mean', normalize: true });
  return Array.from(output.data);
}
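If you plan to index many documents, the OpenAI embeddings endpoint also accepts an array of inputs, which avoids one HTTP round trip per text. A small sketch (embedBatch is a hypothetical helper, not part of arango-typed):

async function embedBatch(texts: string[]): Promise<number[][]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: texts
  });
  // The API returns one embedding per input, tagged with its index
  return response.data
    .sort((a, b) => a.index - b.index)
    .map(item => item.embedding);
}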
Storing Documents with Embeddings
Basic Storage
import { getVectorSearch, getDatabase } from 'arango-typed';

const vectorSearch = getVectorSearch();
const db = getDatabase();

// Store document with embedding
const document = {
  text: 'ArangoDB is a multi-model database',
  title: 'Introduction to ArangoDB',
  embedding: await embed('ArangoDB is a multi-model database'),
  category: 'database',
  createdAt: new Date()
};

await db.collection('documents').save(document);
Batch Storage with Precomputed Magnitudes
For better performance, precompute vector magnitudes when storing:
import { VectorSearch, getDatabase } from 'arango-typed';

const db = getDatabase();

const documents = [
  { text: 'Document 1', embedding: await embed('Document 1') },
  { text: 'Document 2', embedding: await embed('Document 2') },
  { text: 'Document 3', embedding: await embed('Document 3') }
];

// Precompute magnitudes for faster cosine similarity
const documentsWithMagnitudes = documents.map(doc => ({
  ...doc,
  _vectorMagnitude: VectorSearch.computeMagnitude(doc.embedding)
}));

await db.collection('documents').import(documentsWithMagnitudes);
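For very large collections you may want to split the import into chunks rather than sending everything in one request. A sketch reusing the same import() call shown above; the batch size is an assumption to tune for your document and vector sizes:

const BATCH_SIZE = 1000; // assumption: adjust for your payload size

for (let i = 0; i < documentsWithMagnitudes.length; i += BATCH_SIZE) {
  const chunk = documentsWithMagnitudes.slice(i, i + BATCH_SIZE);
  await db.collection('documents').import(chunk);
}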
Similarity Search Methods
1. Cosine Similarity (Default)
Cosine similarity measures the angle between two vectors, making it ideal for text embeddings. It ranges from -1 (opposite) to 1 (identical). Values closer to 1 indicate higher similarity.
Formula: cosine_similarity = dot(a, b) / (||a|| * ||b||)
const queryEmbedding = await embed('What is ArangoDB?');

const results = await vectorSearch.similaritySearch(
  'documents',
  queryEmbedding,
  {
    limit: 10,                        // Number of results
    threshold: 0.7,                   // Minimum similarity score (0.0 to 1.0)
    usePrecomputedMagnitudes: true,   // Use precomputed magnitudes (faster)
    filter: { category: 'database' }, // Optional: filter by metadata
    cache: cacheManager               // Optional: cache results
  }
);

// Results include a _similarity score
results.forEach(result => {
  console.log(result.text, result._similarity);
});
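For intuition, here is the cosine formula above in plain TypeScript. This is a reference sketch of what the _similarity score means, not how the library executes the search:

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, sumA = 0, sumB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];  // dot(a, b)
    sumA += a[i] * a[i]; // builds ||a||²
    sumB += b[i] * b[i]; // builds ||b||²
  }
  return dot / (Math.sqrt(sumA) * Math.sqrt(sumB));
}

The sketch also shows why precomputed magnitudes help: the square root for a stored document's vector never changes, so recomputing it on every query is wasted work.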
2. Euclidean Distance
Euclidean distance measures the straight-line distance between vectors. Lower values indicate higher similarity. Useful when vector magnitudes carry meaning; for normalized embeddings it produces the same ranking as cosine similarity.
Formula: distance = sqrt(sum((a[i] - b[i])²))
const results = await vectorSearch.euclideanSearch(
  'documents',
  queryEmbedding,
  {
    limit: 10,
    threshold: 2.0,          // Maximum distance (lower = more similar)
    filter: { active: true } // Optional: filter by metadata
  }
);

// Results include a _distance value
results.forEach(result => {
  console.log(result.text, result._distance);
});
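As a reference, the distance formula in plain TypeScript:

function euclideanDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += diff * diff; // (a[i] - b[i])²
  }
  return Math.sqrt(sum);
}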
3. Dot Product
Dot product is the sum of element-wise products. Faster than cosine similarity but requires normalized vectors for meaningful results.
const results = await vectorSearch.similaritySearch(
  'documents',
  queryEmbedding,
  {
    distance: 'dot', // Use dot product
    limit: 10,
    threshold: 0.5
  }
);
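Dot product only behaves as a similarity score when vectors are unit length; for unit vectors it is exactly equal to cosine similarity, since both norms in the denominator are 1. If your model does not normalize its output (the @xenova/transformers example above does, via normalize: true), you can normalize yourself:

function normalize(v: number[]): number[] {
  const magnitude = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return v.map(x => x / magnitude); // scale to unit length
}

const normalizedEmbedding = normalize(await embed('What is ArangoDB?'));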
4. Hybrid Search
Hybrid search combines vector similarity with keyword search (BM25) for best results. Useful when you want both semantic understanding and exact keyword matching.
const results = await vectorSearch.hybridSearch(
  'documents',
  queryEmbedding,
  'multi-model database', // Keywords for BM25 search
  {
    limit: 10,
    threshold: 0.5,
    keywordWeight: 0.4, // Weight for keyword score (0.0 to 1.0)
    vectorWeight: 0.6,  // Weight for vector score (0.0 to 1.0)
    filter: { category: 'database' }
  }
);

// Results include the combined score and the individual scores
results.forEach(result => {
  console.log(result.text);
  console.log(' Combined:', result._score);
  console.log(' Vector:', result._vectorScore);
  console.log(' Keyword:', result._keywordScore);
});
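The two weight options suggest that _score is a weighted linear sum of the vector and keyword scores. Assuming that scheme (the library's exact formula may differ), the combination would look like this:

// Hypothetical illustration of the combination, not the library's verified formula
function combineScores(vectorScore: number, keywordScore: number,
                       vectorWeight = 0.6, keywordWeight = 0.4): number {
  return vectorWeight * vectorScore + keywordWeight * keywordScore;
}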
Precomputing Vector Magnitudes
For large collections, precomputing vector magnitudes significantly improves cosine similarity performance. The magnitude (L2 norm) is computed once and stored, avoiding recalculation on every search.
Compute Magnitude for Single Vector
import { VectorSearch } from 'arango-typed';
const embedding = [0.1, 0.2, 0.3, 0.4, ...];
const magnitude = VectorSearch.computeMagnitude(embedding);
console.log(magnitude); // L2 norm: sqrt(sum(v²))
Ensure All Documents Have Magnitudes
Automatically compute and store magnitudes for all documents missing them:
// Compute magnitudes for all documents without them
await vectorSearch.ensureMagnitudes(
  'documents',
  'embedding',       // Vector field name
  '_vectorMagnitude' // Field to store magnitude
);
Update Existing Documents
// When storing new documents, compute the embedding once and include its magnitude
const embedding = await embed('New document');

const doc = {
  text: 'New document',
  embedding,
  _vectorMagnitude: VectorSearch.computeMagnitude(embedding)
};

await db.collection('documents').save(doc);
Filtering and Metadata
Combine vector search with metadata filtering for precise results:
const results = await vectorSearch.similaritySearch(
  'documents',
  queryEmbedding,
  {
    limit: 10,
    threshold: 0.7,
    filter: {
      category: 'database',               // Exact match
      published: true,                    // Boolean filter
      views: { $gte: 100 },               // Range filter
      tags: { $in: ['tech', 'database'] } // Array contains
    }
  }
);
Caching
Cache search results for frequently queried vectors to improve performance:
import { CacheManager, MemoryCache } from 'arango-typed';

const cache = new MemoryCache();
const cacheManager = new CacheManager(cache);

const results = await vectorSearch.similaritySearch(
  'documents',
  queryEmbedding,
  {
    limit: 10,
    cache: cacheManager // Results cached for 5 minutes
  }
);

// Subsequent identical queries return cached results instantly
Complete RAG Example
Full example of a RAG (Retrieval Augmented Generation) system:
import { getVectorSearch, getDatabase, VectorSearch } from 'arango-typed';
import OpenAI from 'openai';

const vectorSearch = getVectorSearch();
const db = getDatabase();
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

// Helper function to generate embeddings
async function embed(text: string): Promise<number[]> {
  const response = await openai.embeddings.create({
    model: 'text-embedding-3-small',
    input: text
  });
  return response.data[0].embedding;
}

// 1. Store documents with embeddings
async function indexDocument(text: string, metadata: any) {
  const embedding = await embed(text);
  const magnitude = VectorSearch.computeMagnitude(embedding);
  await db.collection('documents').save({
    text,
    embedding,
    _vectorMagnitude: magnitude,
    ...metadata,
    indexedAt: new Date()
  });
}

// 2. Search for relevant context
async function retrieveContext(query: string, topK: number = 5) {
  const queryEmbedding = await embed(query);
  const results = await vectorSearch.similaritySearch(
    'documents',
    queryEmbedding,
    {
      limit: topK,
      threshold: 0.7,
      usePrecomputedMagnitudes: true
    }
  );
  return results.map(r => r.text);
}

// 3. Generate answer with context
async function answerQuestion(question: string) {
  const context = await retrieveContext(question);
  const prompt = `Context:
${context.join('\n\n')}
Question: ${question}
Answer:`;
  const response = await openai.chat.completions.create({
    model: 'gpt-4',
    messages: [{ role: 'user', content: prompt }]
  });
  return response.choices[0].message.content;
}

// Usage
await indexDocument('ArangoDB is a multi-model database...', { category: 'database' });
const answer = await answerQuestion('What is ArangoDB?');
Performance Optimization
1. Precompute Magnitudes
Always precompute and store vector magnitudes. This can improve cosine similarity performance by 2-3x:
// ✅ Good: Precompute the magnitude once at write time
const doc = {
  embedding: vector,
  _vectorMagnitude: VectorSearch.computeMagnitude(vector)
};

// ❌ Bad: Forces the magnitude to be computed on every search
const doc = { embedding: vector };
2. Use Appropriate Distance Metric
- Cosine Similarity: Best for text embeddings (OpenAI, Cohere, sentence-transformers)
- Euclidean Distance: Best for normalized embeddings or when magnitude matters
- Dot Product: Fastest, but requires normalized vectors
3. Set Reasonable Thresholds
Use thresholds to filter out low-quality matches:
// Cosine similarity: 0.7+ is typically good
{ threshold: 0.7 }
// Euclidean distance: Lower is better (depends on vector dimensions)
{ threshold: 1.5 }
4. Use Filters to Narrow Search Space
Combine vector search with metadata filters to reduce computation:
// Search only in specific category
{ filter: { category: 'database', published: true } }
5. Index Vector Fields
Create indexes on vector fields for faster filtering:
await vectorSearch.createVectorIndex('documents', 'embedding');
6. Cache Frequently Queried Vectors
Cache results for common queries to avoid recomputation:
const cache = new MemoryCache();
const cacheManager = new CacheManager(cache);
// Results cached automatically
await vectorSearch.similaritySearch('documents', queryVector, { cache: cacheManager });
Best Practices
- Normalize Embeddings: Ensure your embedding model produces normalized vectors for consistent results
- Precompute Magnitudes: Always store _vectorMagnitude when indexing documents
- Use Appropriate Dimensions: Most models use 384, 512, or 1536 dimensions. Higher dimensions = better accuracy but slower queries
- Batch Operations: When indexing many documents, use batch imports with precomputed magnitudes
- Monitor Performance: Track query times and adjust thresholds/limits based on your use case
- Combine with Filters: Use metadata filters to narrow search space before vector comparison
- Test Different Metrics: Try cosine, euclidean, and hybrid search to find what works best for your data
Common Use Cases
- RAG Systems: Retrieve relevant context for LLM prompts
- Semantic Search: Find documents by meaning, not keywords
- Recommendation Engines: Find similar items based on embeddings
- Duplicate Detection: Find near-duplicate documents
- Content Classification: Classify documents based on similarity to known examples
- Question Answering: Find relevant passages for answering questions