Partition Strategies and Directory Layout
Choosing Your Partition Keys: The partition strategy determines both query performance and operational complexity. The fundamental question is: which columns appear in WHERE clauses most frequently, and how can you organize directories to maximize pruning?
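To make the pruning intuition concrete, here is a toy sketch in plain Python (the directory names are illustrative, not tied to any engine): a filter on the partition column lets the planner discard whole directories before it reads a single data file.

```python
# Toy model of partition pruning over Hive-style directories (illustrative paths).
partitions = [
    "dt=2024-12-24/city=SF/", "dt=2024-12-24/city=NYC/",
    "dt=2024-12-25/city=SF/", "dt=2024-12-25/city=NYC/",
]

def prune(paths, dt_filter):
    """Keep only directories whose dt= value matches the WHERE-clause filter."""
    return [p for p in paths if p.startswith(f"dt={dt_filter}/")]

print(prune(partitions, "2024-12-25"))  # only 2 of 4 partitions are scanned
```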
Real World Scale: Consider an analytics platform storing 5 TB of compressed data per day. With a 256 MB target file size, that is roughly 20,000 files per day. Using composite partitioning by date then city (100 active cities), you get about 200 files per city per day. Over a 180 day retention window, this creates 18,000 logical partitions holding roughly 3.6 million total files.
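The arithmetic behind those numbers, as a quick back-of-envelope script (the constants are taken from the scenario above):

```python
# Back-of-envelope sizing for the analytics platform example.
TB, MB = 1024 ** 4, 1024 ** 2

daily_bytes = 5 * TB          # compressed data landed per day
target_file = 256 * MB        # target file size
cities = 100                  # second-level partition key
retention_days = 180

files_per_day = daily_bytes // target_file      # 20,480  (~20,000)
files_per_city_day = files_per_day // cities    # ~200
partitions = retention_days * cities            # 18,000 date/city partitions
total_files = files_per_day * retention_days    # ~3.7 million (~3.6M at 20k/day)

print(files_per_day, files_per_city_day, partitions, total_files)
```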
Modern table formats like Apache Iceberg and Delta Lake support hidden partitioning. You declare transformations like date(trip_start_time) or bucket(64, user_id) in metadata without exposing physical partition columns. The query engine automatically applies these transforms during planning. This simplifies schema evolution because you can change partition strategies without rewriting queries.
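As a minimal sketch, assuming a SparkSession with an Iceberg catalog (called demo here purely for illustration) is already configured, the transforms quoted above can be declared at table creation. The catalog and table names are assumptions; trip_start_time and user_id come from the text.

```python
# Sketch: hidden partitioning via transform functions with PySpark's DataFrameWriterV2.
# Requires Spark 3.x with an Iceberg catalog configured; names are illustrative.
from pyspark.sql import SparkSession
from pyspark.sql.functions import days, bucket

spark = SparkSession.builder.getOrCreate()
trips = spark.table("demo.raw.trips")            # assumed source table

(trips.writeTo("demo.analytics.trips")
      .partitionedBy(days("trip_start_time"),    # daily partitions, no explicit date column
                     bucket(64, "user_id"))      # 64 hash buckets for the high-cardinality key
      .create())
```

Queries keep filtering on trip_start_time and user_id directly; the engine maps those predicates onto the stored transforms during planning.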
Cardinality Matters: Choose partition keys with moderate cardinality. Too low (only 3 regions) creates skewed partitions where some are 100x larger than others. Too high (millions of unique user IDs) explodes directory count and metadata overhead. Sweet spot is typically 50 to 500 unique values per partition level. For high cardinality dimensions like user_id, use hash bucketing into fixed buckets (64 or 128) rather than direct partitioning.
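A minimal sketch of that bucketing idea: a stable hash of the high-cardinality key, taken modulo a fixed bucket count (128 here, matching the text). zlib.crc32 stands in for whatever hash function the engine actually uses.

```python
# Hash bucketing a high-cardinality key into a fixed number of buckets.
# Uses a stable hash rather than Python's built-in hash(), which is salted per
# process, so the same user_id always maps to the same bucket across jobs.
import zlib

NUM_BUCKETS = 128

def bucket_for(user_id: str, num_buckets: int = NUM_BUCKETS) -> int:
    return zlib.crc32(user_id.encode("utf-8")) % num_buckets

print(f"bucket={bucket_for('user_8675309')}")   # yields a bucket=<0..127> directory value
```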
1. Range Partitioning: Partition by continuous values like date, timestamp, or numeric ID ranges. Example: event_date=2024-12-25. Perfect for time-series data where queries filter by date ranges.
2. Hash Partitioning: Apply a hash function to distribute data evenly. Example: bucket=47, where bucket equals hash(user_id) mod 128. Spreads write load uniformly but makes range queries on the hashed column expensive.
3. List Partitioning: Partition by enumerated values like country, region, or tenant. Example: region=US. Natural for categorical data with known, bounded cardinality.
4. Composite Partitioning: Combine multiple levels. Example: dt=2024-12-25/region=US/. The first level prunes by date, the second prunes by region within that date (see the path sketch after this list).
Metadata Planning Time: a well-partitioned layout lets the engine plan a query in under 500 ms; a poor layout can push planning past 10 seconds.
💡 Key Takeaways
✓Range partitioning by time works best for append-only workloads where 95% of queries filter by date ranges, enabling partition pruning across the temporal dimension
✓Hash partitioning distributes writes evenly across 64 to 128 buckets, avoiding hot partitions but requiring full scans for range queries on the hashed column
✓Composite partitioning combines levels like date then region, creating manageable partition counts (18,000 partitions for 180 days times 100 regions) while supporting multiple filter dimensions
✓Hidden partitioning in Iceberg or Delta stores transformations as metadata, letting you evolve partition strategy without breaking existing queries
📌 Examples
1. A ride-sharing company uses dt=2024-12-25/city=SF/ composite partitioning, creating 200 files per city per day across 100 cities, resulting in 20,000 files daily and 3.6 million files over 6 months
2. Uber applies hash(user_id) mod 128 to create 128 buckets per date partition, spreading write load evenly and preventing single-user hotspots from creating oversized partitions