Real-time Analytics & OLAP • Pre-aggregation & Rollup PatternsEasy⏱️ ~2 min
What is Pre-aggregation?
Definition
Pre-aggregation means computing summary metrics ahead of time rather than calculating them for every query. Instead of scanning billions of raw events each time someone loads a dashboard, you maintain compact tables of already computed sums, counts, and averages.
Query Performance Impact
RAW DATA
30 sec
→
PRE-AGGREGATED
200 ms
💡 Key Takeaways
✓Pre-aggregation computes summary metrics once and stores them, avoiding repeated expensive calculations over raw data
✓Reduces data volume scanned per query by 100x to 1000x, turning 30 second queries into 200 millisecond responses
✓Primary trade-off is extra storage and pipeline complexity for dramatically faster query performance
✓Most valuable when the same aggregations are queried repeatedly, such as dashboards viewed hundreds of times daily
📌 Examples
1A consumer app logs 5 million events per second. Querying raw data for daily active users takes 30 seconds. Pre-aggregating by country and day reduces this to 10 million rows, enabling 200ms query times.
2An e-commerce site maintains hourly aggregates of revenue by product category. Instead of scanning 2 billion transaction records, queries hit a table with 50,000 rows (100 categories times 500 hours).