Aggregation & Calculations

Use the aggregation builder for counts, grouping, and calculations without writing raw AQL.

What is Aggregation?

Aggregation allows you to perform calculations and transformations on your data, such as counting documents, grouping by fields, calculating sums, averages, and more. arango-typed provides a fluent API for building aggregation queries that compile to efficient AQL.

Why Use Aggregation?

  • Performance: Aggregations run in the database, reducing data transfer
  • Efficiency: Calculate statistics without loading all documents into memory
  • Flexibility: Chain multiple operations for complex calculations
  • Type Safety: TypeScript support for aggregation results
  • Readability: Fluent API is easier to read than raw AQL

Getting Started

Start an aggregation query using the aggregate() method on a model:

import { Schema, model } from 'arango-typed';

const UserSchema = new Schema({
  name: String,
  email: String,
  age: Number,
  country: String,
  active: Boolean
});

const User = model('users', UserSchema);

// Start aggregation
const aggregation = User.aggregate();

Basic Count

Count documents matching a condition:

// Count all active users
const activeUsers = await User.aggregate({
  where: { active: true }
}).execute();

// Or use the count() method on regular queries
const count = await User.find({ active: true }).count();

Group By

Group documents by one or more fields and calculate aggregations:

Single Field Grouping

// Group by country and count users per country
const byCountry = await User.aggregate()
  .groupBy('country')
  .aggregate({
    total: { $count: true }
  })
  .sort({ total: -1 })
  .limit(10)
  .execute();

// Result:
// [
//   { country: 'USA', total: 150 },
//   { country: 'UK', total: 75 },
//   { country: 'Canada', total: 50 }
// ]

Multiple Field Grouping

// Group by country and status
const byCountryAndStatus = await User.aggregate()
  .groupBy(['country', 'active'])
  .aggregate({
    count: { $count: true }
  })
  .execute();

// Result:
// [
//   { country: 'USA', active: true, count: 100 },
//   { country: 'USA', active: false, count: 50 },
//   { country: 'UK', active: true, count: 60 }
// ]

Aggregation Functions

arango-typed supports various aggregation functions:

Sum

Calculate the sum of numeric values:

const OrderSchema = new Schema({
  userId: String,
  amount: Number,
  status: String,
  createdAt: Date
});

const Order = model('orders', OrderSchema);

// Total revenue by status
const revenueByStatus = await Order.aggregate()
  .groupBy('status')
  .aggregate({
    totalRevenue: { $sum: 'amount' },
    orderCount: { $count: true }
  })
  .execute();

// Result:
// [
//   { status: 'paid', totalRevenue: 50000, orderCount: 200 },
//   { status: 'pending', totalRevenue: 10000, orderCount: 50 }
// ]

Average

Calculate the average of numeric values:

// Average order amount by status
const avgByStatus = await Order.aggregate()
  .groupBy('status')
  .aggregate({
    avgAmount: { $avg: 'amount' },
    totalAmount: { $sum: 'amount' },
    count: { $count: true }
  })
  .execute();

Min and Max

Find minimum and maximum values:

// Order statistics by status
const stats = await Order.aggregate()
  .groupBy('status')
  .aggregate({
    minAmount: { $min: 'amount' },
    maxAmount: { $max: 'amount' },
    avgAmount: { $avg: 'amount' },
    totalAmount: { $sum: 'amount' },
    count: { $count: true }
  })
  .execute();

// Result:
// [
//   {
//     status: 'paid',
//     minAmount: 10,
//     maxAmount: 1000,
//     avgAmount: 250,
//     totalAmount: 50000,
//     count: 200
//   }
// ]

Count

Count documents in each group:

// Count users by country
const userCounts = await User.aggregate()
  .groupBy('country')
  .aggregate({
    userCount: { $count: true }
  })
  .sort({ userCount: -1 })
  .execute();

Filtering (WHERE Clause)

Filter documents before aggregation:

Basic Filtering

// Revenue for paid orders only
const paidRevenue = await Order.aggregate({
  where: { status: 'paid' }
})
  .groupBy('userId')
  .aggregate({
    totalSpent: { $sum: 'amount' },
    orderCount: { $count: true }
  })
  .execute();

Complex Filtering

// Revenue for paid orders in the last 30 days
const recentRevenue = await Order.aggregate({
  where: {
    status: 'paid',
    createdAt: { $gte: new Date(Date.now() - 30 * 24 * 60 * 60 * 1000) }
  }
})
  .groupBy('userId')
  .aggregate({
    totalSpent: { $sum: 'amount' }
  })
  .execute();

HAVING Clause

Filter groups after aggregation (similar to SQL HAVING):

// Countries with more than 100 users
const largeCountries = await User.aggregate()
  .groupBy('country')
  .aggregate({
    userCount: { $count: true }
  })
  .having({ userCount: { $gte: 100 } })
  .sort({ userCount: -1 })
  .execute();

// Users who spent more than $1000
const bigSpenders = await Order.aggregate({
  where: { status: 'paid' }
})
  .groupBy('userId')
  .aggregate({
    totalSpent: { $sum: 'amount' }
  })
  .having({ totalSpent: { $gte: 1000 } })
  .sort({ totalSpent: -1 })
  .execute();

Sorting Results

Sort aggregation results:

Single Field Sort

// Top countries by user count
const topCountries = await User.aggregate()
  .groupBy('country')
  .aggregate({
    userCount: { $count: true }
  })
  .sort({ userCount: -1 })  // Descending
  .limit(10)
  .execute();

Multiple Field Sort

// Sort by country (ascending), then by user count (descending)
const sorted = await User.aggregate()
  .groupBy(['country', 'status'])
  .aggregate({
    userCount: { $count: true }
  })
  .sort([
    { field: 'country', direction: 1 },
    { field: 'userCount', direction: -1 }
  ])
  .execute();

Pagination

Limit and skip results for pagination:

// Paginated top countries
const page = 1;
const pageSize = 20;
const skip = (page - 1) * pageSize;

const topCountries = await User.aggregate()
  .groupBy('country')
  .aggregate({
    userCount: { $count: true }
  })
  .sort({ userCount: -1 })
  .skip(skip)
  .limit(pageSize)
  .execute();

Selecting Fields

Select specific fields in the result:

// Only return country and userCount
const countries = await User.aggregate()
  .groupBy('country')
  .aggregate({
    userCount: { $count: true }
  })
  .select(['country', 'userCount'])
  .execute();

Complete Aggregation Examples

Example 1: Sales Analytics

const OrderSchema = new Schema({
  userId: String,
  productId: String,
  amount: Number,
  quantity: Number,
  status: String,
  createdAt: Date,
  country: String
});

const Order = model('orders', OrderSchema);

// Sales by country and status
const salesAnalytics = await Order.aggregate({
  where: {
    status: { $in: ['paid', 'completed'] },
    createdAt: { $gte: new Date('2024-01-01') }
  }
})
  .groupBy(['country', 'status'])
  .aggregate({
    totalRevenue: { $sum: 'amount' },
    totalQuantity: { $sum: 'quantity' },
    avgOrderValue: { $avg: 'amount' },
    orderCount: { $count: true },
    minOrder: { $min: 'amount' },
    maxOrder: { $max: 'amount' }
  })
  .having({ totalRevenue: { $gte: 1000 } })
  .sort({ totalRevenue: -1 })
  .limit(50)
  .execute();

Example 2: User Statistics

const UserSchema = new Schema({
  name: String,
  email: String,
  age: Number,
  country: String,
  active: Boolean,
  createdAt: Date
});

const User = model('users', UserSchema);

// User statistics by country
const userStats = await User.aggregate({
  where: { active: true }
})
  .groupBy('country')
  .aggregate({
    totalUsers: { $count: true },
    avgAge: { $avg: 'age' },
    minAge: { $min: 'age' },
    maxAge: { $max: 'age' }
  })
  .sort({ totalUsers: -1 })
  .execute();

// Result:
// [
//   {
//     country: 'USA',
//     totalUsers: 500,
//     avgAge: 32.5,
//     minAge: 18,
//     maxAge: 80
//   },
//   ...
// ]

Example 3: Time-Based Aggregations

// Daily revenue (requires date extraction in AQL)
// Note: This is a simplified example. For date grouping,
// you may need to use raw AQL or add date fields to your schema

const dailyRevenue = await Order.aggregate({
  where: {
    status: 'paid',
    createdAt: { $gte: new Date('2024-01-01') }
  }
})
  .groupBy('createdAt')  // Group by date (you may need to normalize dates)
  .aggregate({
    revenue: { $sum: 'amount' },
    orderCount: { $count: true }
  })
  .sort({ createdAt: -1 })
  .execute();

Example 4: Multi-Level Grouping

// Group by multiple fields
const multiLevel = await Order.aggregate({
  where: { status: 'paid' }
})
  .groupBy(['country', 'status', 'productId'])
  .aggregate({
    totalRevenue: { $sum: 'amount' },
    totalQuantity: { $sum: 'quantity' },
    orderCount: { $count: true }
  })
  .sort([
    { field: 'country', direction: 1 },
    { field: 'totalRevenue', direction: -1 }
  ])
  .execute();

Combining Multiple Aggregations

You can calculate multiple aggregations in a single query:

// Comprehensive order statistics
const comprehensiveStats = await Order.aggregate({
  where: { status: 'paid' }
})
  .groupBy('userId')
  .aggregate({
    totalSpent: { $sum: 'amount' },
    avgOrderValue: { $avg: 'amount' },
    minOrder: { $min: 'amount' },
    maxOrder: { $max: 'amount' },
    orderCount: { $count: true }
  })
  .having({ orderCount: { $gte: 5 } })
  .sort({ totalSpent: -1 })
  .limit(100)
  .execute();

Using with Multi-Tenancy

Aggregations automatically respect tenant context when multi-tenancy is enabled:

const User = model('users', UserSchema, { tenantEnabled: true });

// This aggregation automatically filters by tenant
const stats = await User.aggregate({
  where: { active: true }
})
  .groupBy('country')
  .aggregate({
    userCount: { $count: true }
  })
  .execute();

// Only includes users for the current tenant

Performance Optimization

Tips for optimizing aggregation queries:

1. Use Indexes

Create indexes on fields used in WHERE and GROUP BY clauses:

const OrderSchema = new Schema({
  userId: String,
  status: String,
  amount: Number,
  country: String,
  createdAt: Date
});

// Index fields used in aggregations
OrderSchema.index(['status', 'country']);
OrderSchema.index(['userId', 'status']);
OrderSchema.index('createdAt');

2. Filter Early

Use WHERE clauses to filter documents before grouping:

// ✅ Good: Filter before grouping
const stats = await Order.aggregate({
  where: { status: 'paid' }  // Filter first
})
  .groupBy('country')
  .aggregate({ revenue: { $sum: 'amount' } })
  .execute();

// ❌ Bad: Group all, then filter (less efficient)

3. Limit Results

Use LIMIT to reduce result set size:

// Only get top 10 results
const topCountries = await User.aggregate()
  .groupBy('country')
  .aggregate({ userCount: { $count: true } })
  .sort({ userCount: -1 })
  .limit(10)  // Limit results
  .execute();

4. Select Only Needed Fields

Use SELECT to return only necessary fields:

// Only return country and count
const countries = await User.aggregate()
  .groupBy('country')
  .aggregate({ userCount: { $count: true } })
  .select(['country', 'userCount'])  // Only these fields
  .execute();

Common Patterns

Top N by Count

// Top 10 countries by user count
const topCountries = await User.aggregate()
  .groupBy('country')
  .aggregate({ userCount: { $count: true } })
  .sort({ userCount: -1 })
  .limit(10)
  .execute();

Total and Average

// Total and average order value
const orderStats = await Order.aggregate({
  where: { status: 'paid' }
})
  .aggregate({
    totalRevenue: { $sum: 'amount' },
    avgOrderValue: { $avg: 'amount' },
    orderCount: { $count: true }
  })
  .execute();

// Result: [{ totalRevenue: 50000, avgOrderValue: 250, orderCount: 200 }]

Min/Max with Grouping

// Price range by product category
const priceRanges = await Product.aggregate()
  .groupBy('category')
  .aggregate({
    minPrice: { $min: 'price' },
    maxPrice: { $max: 'price' },
    avgPrice: { $avg: 'price' },
    productCount: { $count: true }
  })
  .execute();

Filtered Aggregations

// Revenue for high-value orders only
const highValueStats = await Order.aggregate({
  where: {
    status: 'paid',
    amount: { $gte: 100 }
  }
})
  .groupBy('country')
  .aggregate({
    totalRevenue: { $sum: 'amount' },
    orderCount: { $count: true }
  })
  .having({ totalRevenue: { $gte: 10000 } })
  .execute();

Error Handling

Always handle errors when executing aggregations:

try {
  const stats = await Order.aggregate()
    .groupBy('status')
    .aggregate({
      totalRevenue: { $sum: 'amount' }
    })
    .execute();
  
  console.log('Stats:', stats);
} catch (error) {
  console.error('Aggregation failed:', error);
  // Handle error appropriately
}

Best Practices

  • Index Group By Fields: Create indexes on fields used in GROUP BY
  • Filter Early: Use WHERE clauses to reduce data before grouping
  • Use HAVING for Group Filters: Filter groups after aggregation, not before
  • Limit Results: Use LIMIT to prevent large result sets
  • Select Only Needed Fields: Use SELECT to reduce data transfer
  • Combine Aggregations: Calculate multiple stats in one query when possible
  • Test Performance: Use ArangoDB query profiler to optimize slow queries
  • Handle Errors: Always wrap aggregation calls in try-catch
  • Use Multi-Tenancy: Enable tenant filtering for SaaS applications
  • Document Complex Queries: Add comments for complex aggregation logic

Limitations and Considerations

  • Memory Usage: Large aggregations may consume significant memory
  • Performance: Complex aggregations on large datasets can be slow
  • Index Requirements: Some aggregations require specific indexes
  • Date Grouping: Date-based grouping may require date normalization
  • Raw AQL: For very complex aggregations, consider using raw AQL
📚 API Reference: For complete API documentation including all methods and TypeScript types, see Aggregation Module API Reference.
Next: Learn about Queries for advanced querying capabilities.