Data Warehousing Fundamentals • Partitioning & Clustering Strategies
When to Use Partitioning vs Alternatives
The Decision Framework:
Partitioning is not a universal solution. It trades operational complexity and flexibility for targeted query performance. The key decision is whether your workload has predictable, high selectivity filters that align with partition keys.
Partitioning for Analytical Workloads:
Analytical systems with append-only writes, large scans, and time-based or tenant-based filters are ideal for partitioning. Consider a dashboard showing daily active users over the past 30 days. With date partitioning, the query scans exactly 30 daily partitions; without partitioning, it scans years of history. In a 100 TB warehouse whose event table holds roughly 60 TB, pruning to the last 30 days reads only 2 to 3 TB instead of the full 60 TB, a 20x to 30x reduction. Query latency drops from several minutes to 10 to 20 seconds.
The math works because analytical queries are often range scans or aggregations over large chunks of data. Partition pruning eliminates huge swaths of irrelevant data upfront. Clustering within partitions further optimizes by enabling block skipping based on zone maps.
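To make the two pruning layers concrete, here is a minimal, illustrative sketch. The Partition and Block classes and the toy query function are assumptions for this example, not any engine's API; real systems keep zone maps in metadata, not in memory like this:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class Block:
    min_user_id: int          # zone map: min/max of the clustered column
    max_user_id: int
    rows: list                # (user_id, payload) tuples

@dataclass
class Partition:
    day: date                 # partition key: one partition per day
    blocks: list

def daily_actives(partitions, start, end, user_lo, user_hi):
    """Return matching rows plus the number of blocks actually read."""
    scanned, matches = 0, []
    for p in partitions:
        if not (start <= p.day <= end):
            continue                      # partition pruning on the date key
        for b in p.blocks:
            if b.max_user_id < user_lo or b.min_user_id > user_hi:
                continue                  # zone map proves no qualifying rows
            scanned += 1                  # only now do we pay to read the block
            matches += [r for r in b.rows if user_lo <= r[0] <= user_hi]
    return matches, scanned
```

Partition pruning never touches partitions outside the date range, and the zone-map check skips blocks whose min/max bounds exclude the filter, so the cost scales with the data that can match rather than with total history.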
Use Partitioning When: 90%+ of queries filter on the same key (date, region, tenant), the read/write ratio is over 100:1, and data volume exceeds 1 TB.
Use Indexes When: queries are ad hoc across many columns, point reads must return in under 50 ms, or updates and deletes are frequent.
Indexes for Transactional Point Reads:
Online Transaction Processing (OLTP) systems need fast point reads and updates. Consider a user profile lookup by user_id with a target latency under 10 milliseconds (ms). A B-tree index on user_id provides O(log n) lookup, typically 3 to 4 disk seeks for millions of rows, completing in 5 to 15 ms.
Partitioning alone cannot achieve this. Even with perfect partition pruning, the query still scans a partition containing thousands or millions of rows. For low latency point queries, indexes are essential. The trade-off is that indexes add write overhead: each insert or update must also modify every index, reducing write throughput. A table with 5 indexes performs 6 writes per insert, dropping throughput from 50,000 to roughly 8,000 inserts per second.
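The latency and throughput figures above fall out of simple arithmetic. A hedged sketch, assuming a B-tree fanout of 200 and one extra physical write per secondary index (real engines buffer and batch, so treat these as rough bounds):

```python
import math

def btree_seeks(rows: int, fanout: int = 200) -> int:
    """Tree height ~= log_fanout(rows); roughly one disk seek per level."""
    return max(1, math.ceil(math.log(rows, fanout)))

def insert_throughput(base_writes_per_sec: float, num_indexes: int) -> float:
    """Each secondary index turns one logical insert into one extra write."""
    return base_writes_per_sec / (1 + num_indexes)

print(btree_seeks(1_000_000))        # 3 seeks for 1M rows at fanout 200
print(btree_seeks(10_000_000))       # 4 seeks for 10M rows
print(insert_throughput(50_000, 5))  # ~8,333 inserts/sec with 5 indexes
```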
Materialized Views for Complex Aggregations:
When queries repeatedly compute expensive aggregations like daily unique users or hourly revenue by region, materialized views precompute the results. Instead of scanning raw events every time, queries read a pre-aggregated table. This is complementary to partitioning: partition the raw event table by date and cluster by user, then build a materialized view of daily stats, itself also partitioned by date.
The trade off is staleness and maintenance cost. Materialized views lag behind source data by minutes to hours depending on refresh frequency. Refreshing a materialized view over 100 TB of raw data can take 30 to 60 minutes and consume significant compute.
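As a rough illustration of the refresh pattern, here is an in-memory sketch; the per-day sets stand in for exact distinct counts, which real systems often replace with approximate sketches such as HyperLogLog or with engine-native refresh:

```python
from collections import defaultdict
from datetime import date

def refresh_daily_actives(mv: dict, new_events) -> dict:
    """Fold a batch of (event_date, user_id) events into the view.

    Only the dates touched by the batch are updated, mirroring a
    partition-aligned incremental refresh instead of a full rebuild.
    """
    touched = defaultdict(set)
    for event_date, user_id in new_events:
        touched[event_date].add(user_id)
    for d, users in touched.items():
        mv.setdefault(d, set()).update(users)
    return mv

mv = {}                                   # the "materialized view" table
refresh_daily_actives(mv, [(date(2024, 1, 1), 7), (date(2024, 1, 1), 9)])
print({d: len(u) for d, u in mv.items()}) # {datetime.date(2024, 1, 1): 2}
```

Because only touched dates are recomputed, the refresh cost tracks new data volume rather than the full 100 TB of history, which is what makes the pattern affordable at high refresh frequencies.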
Sharding for Write Scalability:
In distributed OLTP systems, partitioning is often called sharding, and the goal shifts from query optimization to write scalability. Systems like Amazon DynamoDB or Cassandra hash-partition by primary key to distribute writes across hundreds or thousands of nodes. Each node handles a subset of the key space, enabling aggregate write throughput of 200,000+ operations per second.
Range partitioning in OLTP is less common because it risks hot spots. Sequential IDs or timestamps create hot partitions where recent data receives all writes. Hash partitioning spreads load evenly but breaks range query locality. You cannot efficiently scan all users in a range without hitting many shards.
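A toy sketch of the routing logic, assuming 8 shards and MD5-based key hashing (illustrative choices, not DynamoDB's actual scheme); note how single-key writes touch one shard while the range scan must fan out to every shard:

```python
import hashlib

NUM_SHARDS = 8  # illustrative; production systems use hundreds or thousands

def shard_for(key: str) -> int:
    # Stable hash so the same key always routes to the same shard.
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % NUM_SHARDS

def write(shards, key, value):
    # A single write touches exactly one shard: load spreads evenly.
    shards[shard_for(key)][key] = value

def range_scan(shards, lo, hi):
    # Range locality is lost: every shard must be queried (scatter-gather).
    return sorted((k, v) for s in shards for k, v in s.items() if lo <= k <= hi)

shards = [{} for _ in range(NUM_SHARDS)]
for uid in ("user_001", "user_002", "user_950"):
    write(shards, uid, {"id": uid})
print(range_scan(shards, "user_000", "user_100"))
```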
⚠️ Common Pitfall: Applying analytical partitioning strategies to OLTP systems, or vice versa. Date partitioning a user profile database where lookups are by user_id provides no benefit and complicates schema management.
The Interview Answer:
When asked about partitioning, start by clarifying the workload. For OLAP (Online Analytical Processing) with time-series data and read-heavy patterns (over 100:1 read/write ratio), partition by time and cluster by common join keys. For OLTP with point reads and a balanced read/write mix, use indexes and consider hash sharding for horizontal scaling. For hybrid workloads, combine partitioning with indexes on hot columns, or split the workload into analytical and transactional databases with replication between them. Always quantify the trade-off: partitioning saves X percent in scan cost but adds Y complexity in schema management and operational overhead.
💡 Key Takeaways
✓ Partitioning excels for analytical workloads with time or tenant filters, read/write ratios over 100:1, and data volumes over 1 TB, reducing scans by 20x to 100x
✓ Indexes are essential for OLTP point reads targeting under 10 to 20 ms latency, but each index adds write amplification: a table with 5 indexes performs 6 writes per insert, cutting throughput from 50,000 to roughly 8,000 inserts per second
✓ Materialized views complement partitioning for repeated aggregations, trading staleness (minutes to hours of lag) and refresh cost for faster query response
✓ Hash partitioning (sharding) in OLTP distributes writes evenly across nodes for scalability but breaks range query locality, while range partitioning risks hot spots with sequential keys
✓ The decision hinges on read/write ratio, query selectivity, latency requirements, and whether the workload is analytical scans or transactional point queries
📌 Examples
1. Analytical: 100 TB event table, 90% of queries filter on date; partitioning reduces the scan from 100 TB to 2 TB, cutting cost from $5,000 to $100 per query at $0.05 per GB scanned (a quick cost check follows this list)
2. OLTP: user lookup by user_id needs under 10 ms latency; a B-tree index achieves 5 to 8 ms, while partitioning alone would still scan thousands of rows and take 100+ ms
3. Hybrid: e-commerce system with a transactional database for orders (indexed by order_id, 50k writes/sec) replicated to an analytical warehouse (partitioned by order_date for BI dashboards)
4. Sharding: DynamoDB table hash-partitioned by customer_id across 500 partitions, supporting 200k writes/sec, but customer_id range scans require scatter-gather across all partitions
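A quick check of the cost arithmetic in example 1, assuming a billing model of $0.05 per GB scanned (the rate stated above; actual warehouse pricing varies):

```python
PRICE_PER_GB = 0.05                      # $0.05 per GB scanned (example 1)

def query_cost(tb_scanned: float) -> float:
    return tb_scanned * 1_000 * PRICE_PER_GB   # 1 TB billed as 1,000 GB

print(query_cost(100))                   # $5,000: full scan of 100 TB
print(query_cost(2))                     # $100: after pruning to 2 TB
print(query_cost(100) - query_cost(2))   # $4,900 saved per query
```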