Aggregation & Calculations
Use the aggregation builder for counts, grouping, and calculations without writing raw AQL.
What is Aggregation?
Aggregation allows you to perform calculations and transformations on your data, such as counting documents, grouping by fields, calculating sums, averages, and more. arango-typed provides a fluent API for building aggregation queries that compile to efficient AQL.
Why Use Aggregation?
- Performance: Aggregations run in the database, reducing data transfer
- Efficiency: Calculate statistics without loading all documents into memory
- Flexibility: Chain multiple operations for complex calculations
- Type Safety: TypeScript support for aggregation results
- Readability: Fluent API is easier to read than raw AQL
Getting Started
Start an aggregation query using the aggregate() method on a model:
import { Schema, model } from 'arango-typed';
const UserSchema = new Schema({
name: String,
email: String,
age: Number,
country: String,
active: Boolean
});
const User = model('users', UserSchema);
// Start aggregation
const aggregation = User.aggregate();
Basic Count
Count documents matching a condition:
// Count all active users
const activeUsers = await User.aggregate({
where: { active: true }
}).execute();
// Or use the count() method on regular queries
const count = await User.find({ active: true }).count();
Group By
Group documents by one or more fields and calculate aggregations:
Single Field Grouping
// Group by country and count users per country
const byCountry = await User.aggregate()
.groupBy('country')
.aggregate({
total: { $count: true }
})
.sort({ total: -1 })
.limit(10)
.execute();
// Result:
// [
// { country: 'USA', total: 150 },
// { country: 'UK', total: 75 },
// { country: 'Canada', total: 50 }
// ]
Multiple Field Grouping
// Group by country and status
const byCountryAndStatus = await User.aggregate()
.groupBy(['country', 'active'])
.aggregate({
count: { $count: true }
})
.execute();
// Result:
// [
// { country: 'USA', active: true, count: 100 },
// { country: 'USA', active: false, count: 50 },
// { country: 'UK', active: true, count: 60 }
// ]
Aggregation Functions
arango-typed supports various aggregation functions:
Sum
Calculate the sum of numeric values:
const OrderSchema = new Schema({
userId: String,
amount: Number,
status: String,
createdAt: Date
});
const Order = model('orders', OrderSchema);
// Total revenue by status
const revenueByStatus = await Order.aggregate()
.groupBy('status')
.aggregate({
totalRevenue: { $sum: 'amount' },
orderCount: { $count: true }
})
.execute();
// Result:
// [
// { status: 'paid', totalRevenue: 50000, orderCount: 200 },
// { status: 'pending', totalRevenue: 10000, orderCount: 50 }
// ]
Average
Calculate the average of numeric values:
// Average order amount by status
const avgByStatus = await Order.aggregate()
.groupBy('status')
.aggregate({
avgAmount: { $avg: 'amount' },
totalAmount: { $sum: 'amount' },
count: { $count: true }
})
.execute();
Min and Max
Find minimum and maximum values:
// Order statistics by status
const stats = await Order.aggregate()
.groupBy('status')
.aggregate({
minAmount: { $min: 'amount' },
maxAmount: { $max: 'amount' },
avgAmount: { $avg: 'amount' },
totalAmount: { $sum: 'amount' },
count: { $count: true }
})
.execute();
// Result:
// [
// {
// status: 'paid',
// minAmount: 10,
// maxAmount: 1000,
// avgAmount: 250,
// totalAmount: 50000,
// count: 200
// }
// ]
Count
Count documents in each group:
// Count users by country
const userCounts = await User.aggregate()
.groupBy('country')
.aggregate({
userCount: { $count: true }
})
.sort({ userCount: -1 })
.execute();
Filtering (WHERE Clause)
Filter documents before aggregation:
Basic Filtering
// Revenue for paid orders only
const paidRevenue = await Order.aggregate({
where: { status: 'paid' }
})
.groupBy('userId')
.aggregate({
totalSpent: { $sum: 'amount' },
orderCount: { $count: true }
})
.execute();
Complex Filtering
// Revenue for paid orders in the last 30 days
const recentRevenue = await Order.aggregate({
where: {
status: 'paid',
createdAt: { $gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) }
}
})
.groupBy('userId')
.aggregate({
totalSpent: { $sum: 'amount' }
})
.execute();
HAVING Clause
Filter groups after aggregation (similar to SQL HAVING):
// Countries with more than 100 users
const largeCountries = await User.aggregate()
.groupBy('country')
.aggregate({
userCount: { $count: true }
})
.having({ userCount: { $gte: 100 } })
.sort({ userCount: -1 })
.execute();
// Users who spent more than $1000
const bigSpenders = await Order.aggregate({
where: { status: 'paid' }
})
.groupBy('userId')
.aggregate({
totalSpent: { $sum: 'amount' }
})
.having({ totalSpent: { $gte: 1000 } })
.sort({ totalSpent: -1 })
.execute();
Sorting Results
Sort aggregation results:
Single Field Sort
// Top countries by user count
const topCountries = await User.aggregate()
.groupBy('country')
.aggregate({
userCount: { $count: true }
})
.sort({ userCount: -1 }) // Descending
.limit(10)
.execute();
Multiple Field Sort
// Sort by country (ascending), then by user count (descending)
const sorted = await User.aggregate()
.groupBy(['country', 'status'])
.aggregate({
userCount: { $count: true }
})
.sort([
{ field: 'country', direction: 1 },
{ field: 'userCount', direction: -1 }
])
.execute();
Pagination
Limit and skip results for pagination:
// Paginated top countries
const page = 1;
const pageSize = 20;
const skip = (page - 1) * pageSize;
const topCountries = await User.aggregate()
.groupBy('country')
.aggregate({
userCount: { $count: true }
})
.sort({ userCount: -1 })
.skip(skip)
.limit(pageSize)
.execute();
Selecting Fields
Select specific fields in the result:
// Only return country and userCount
const countries = await User.aggregate()
.groupBy('country')
.aggregate({
userCount: { $count: true }
})
.select(['country', 'userCount'])
.execute();
Complete Aggregation Examples
Example 1: Sales Analytics
const OrderSchema = new Schema({
userId: String,
productId: String,
amount: Number,
quantity: Number,
status: String,
createdAt: Date,
country: String
});
const Order = model('orders', OrderSchema);
// Sales by country and status
const salesAnalytics = await Order.aggregate({
where: {
status: { $in: ['paid', 'completed'] },
createdAt: { $gte: new Date('2024-01-01') }
}
})
.groupBy(['country', 'status'])
.aggregate({
totalRevenue: { $sum: 'amount' },
totalQuantity: { $sum: 'quantity' },
avgOrderValue: { $avg: 'amount' },
orderCount: { $count: true },
minOrder: { $min: 'amount' },
maxOrder: { $max: 'amount' }
})
.having({ totalRevenue: { $gte: 1000 } })
.sort({ totalRevenue: -1 })
.limit(50)
.execute();
Example 2: User Statistics
const UserSchema = new Schema({
name: String,
email: String,
age: Number,
country: String,
active: Boolean,
createdAt: Date
});
const User = model('users', UserSchema);
// User statistics by country
const userStats = await User.aggregate({
where: { active: true }
})
.groupBy('country')
.aggregate({
totalUsers: { $count: true },
avgAge: { $avg: 'age' },
minAge: { $min: 'age' },
maxAge: { $max: 'age' }
})
.sort({ totalUsers: -1 })
.execute();
// Result:
// [
// {
// country: 'USA',
// totalUsers: 500,
// avgAge: 32.5,
// minAge: 18,
// maxAge: 80
// },
// ...
// ]
Example 3: Time-Based Aggregations
// Daily revenue (requires date extraction in AQL)
// Note: This is a simplified example. For date grouping,
// you may need to use raw AQL or add date fields to your schema
const dailyRevenue = await Order.aggregate({
where: {
status: 'paid',
createdAt: { $gte: new Date('2024-01-01') }
}
})
.groupBy('createdAt') // Group by date (you may need to normalize dates)
.aggregate({
revenue: { $sum: 'amount' },
orderCount: { $count: true }
})
.sort({ createdAt: -1 })
.execute();
Example 4: Multi-Level Grouping
// Group by multiple fields
const multiLevel = await Order.aggregate({
where: { status: 'paid' }
})
.groupBy(['country', 'status', 'productId'])
.aggregate({
totalRevenue: { $sum: 'amount' },
totalQuantity: { $sum: 'quantity' },
orderCount: { $count: true }
})
.sort([
{ field: 'country', direction: 1 },
{ field: 'totalRevenue', direction: -1 }
])
.execute();
Combining Multiple Aggregations
You can calculate multiple aggregations in a single query:
// Comprehensive order statistics
const comprehensiveStats = await Order.aggregate({
where: { status: 'paid' }
})
.groupBy('userId')
.aggregate({
totalSpent: { $sum: 'amount' },
avgOrderValue: { $avg: 'amount' },
minOrder: { $min: 'amount' },
maxOrder: { $max: 'amount' },
orderCount: { $count: true }
})
.having({ orderCount: { $gte: 5 } })
.sort({ totalSpent: -1 })
.limit(100)
.execute();
Using with Multi-Tenancy
Aggregations automatically respect tenant context when multi-tenancy is enabled:
const User = model('users', UserSchema, { tenantEnabled: true });
// This aggregation automatically filters by tenant
const stats = await User.aggregate({
where: { active: true }
})
.groupBy('country')
.aggregate({
userCount: { $count: true }
})
.execute();
// Only includes users for the current tenant
Performance Optimization
Tips for optimizing aggregation queries:
1. Use Indexes
Create indexes on fields used in WHERE and GROUP BY clauses:
const OrderSchema = new Schema({
userId: String,
status: String,
amount: Number,
country: String,
createdAt: Date
});
// Index fields used in aggregations
OrderSchema.index(['status', 'country']);
OrderSchema.index(['userId', 'status']);
OrderSchema.index('createdAt');
2. Filter Early
Use WHERE clauses to filter documents before grouping:
// ✅ Good: Filter before grouping
const stats = await Order.aggregate({
where: { status: 'paid' } // Filter first
})
.groupBy('country')
.aggregate({ revenue: { $sum: 'amount' } })
.execute();
// ❌ Bad: Group all, then filter (less efficient)
3. Limit Results
Use LIMIT to reduce result set size:
// Only get top 10 results
const topCountries = await User.aggregate()
.groupBy('country')
.aggregate({ userCount: { $count: true } })
.sort({ userCount: -1 })
.limit(10) // Limit results
.execute();
4. Select Only Needed Fields
Use SELECT to return only necessary fields:
// Only return country and count
const countries = await User.aggregate()
.groupBy('country')
.aggregate({ userCount: { $count: true } })
.select(['country', 'userCount']) // Only these fields
.execute();
Common Patterns
Top N by Count
// Top 10 countries by user count
const topCountries = await User.aggregate()
.groupBy('country')
.aggregate({ userCount: { $count: true } })
.sort({ userCount: -1 })
.limit(10)
.execute();
Total and Average
// Total and average order value
const orderStats = await Order.aggregate({
where: { status: 'paid' }
})
.aggregate({
totalRevenue: { $sum: 'amount' },
avgOrderValue: { $avg: 'amount' },
orderCount: { $count: true }
})
.execute();
// Result: [{ totalRevenue: 50000, avgOrderValue: 250, orderCount: 200 }]
Min/Max with Grouping
// Price range by product category
const priceRanges = await Product.aggregate()
.groupBy('category')
.aggregate({
minPrice: { $min: 'price' },
maxPrice: { $max: 'price' },
avgPrice: { $avg: 'price' },
productCount: { $count: true }
})
.execute();
Filtered Aggregations
// Revenue for high-value orders only
const highValueStats = await Order.aggregate({
where: {
status: 'paid',
amount: { $gte: 100 }
}
})
.groupBy('country')
.aggregate({
totalRevenue: { $sum: 'amount' },
orderCount: { $count: true }
})
.having({ totalRevenue: { $gte: 10000 } })
.execute();
Error Handling
Always handle errors when executing aggregations:
try {
const stats = await Order.aggregate()
.groupBy('status')
.aggregate({
totalRevenue: { $sum: 'amount' }
})
.execute();
console.log('Stats:', stats);
} catch (error) {
console.error('Aggregation failed:', error);
// Handle error appropriately
}
Best Practices
- Index Group By Fields: Create indexes on fields used in GROUP BY
- Filter Early: Use WHERE clauses to reduce data before grouping
- Use HAVING for Group Filters: Filter groups after aggregation, not before
- Limit Results: Use LIMIT to prevent large result sets
- Select Only Needed Fields: Use SELECT to reduce data transfer
- Combine Aggregations: Calculate multiple stats in one query when possible
- Test Performance: Use ArangoDB query profiler to optimize slow queries
- Handle Errors: Always wrap aggregation calls in try-catch
- Use Multi-Tenancy: Enable tenant filtering for SaaS applications
- Document Complex Queries: Add comments for complex aggregation logic
Limitations and Considerations
- Memory Usage: Large aggregations may consume significant memory
- Performance: Complex aggregations on large datasets can be slow
- Index Requirements: Some aggregations require specific indexes
- Date Grouping: Date-based grouping may require date normalization
- Raw AQL: For very complex aggregations, consider using raw AQL