Partitioning & Sharding › Hash-based Partitioning · Medium · ⏱️ ~3 min

Hybrid Models: Composite Keys for Limited Range Scan Support

Pure hash-based partitioning sacrifices range query performance because adjacent keys scatter across partitions, forcing expensive scatter-gather operations that query all partitions and merge results. For workloads that need both uniform distribution and occasional range scans, hybrid composite-key schemes provide a middle ground: hash the first component of a composite key for distribution while keeping the second component ordered within each partition. For example, in a multi-tenant system, use tenant_id as the partition key (hashed to distribute tenants evenly) and timestamp as the clustering key (sorted within each tenant's partition). This enables efficient per-tenant time-range queries (SELECT * FROM events WHERE tenant_id = 'acme' AND timestamp BETWEEN t1 AND t2) without scatter-gather, while tenants remain evenly distributed across the cluster.

Apache Cassandra and Amazon DynamoDB both implement this pattern. Cassandra uses a compound primary key: the first part is the partition key (hashed via Murmur3), and the remaining columns form the clustering key (sorted on disk within the partition). DynamoDB similarly hashes the partition key and optionally sorts items by a sort key within each partition, enabling efficient Query operations on a partition key plus a sort key range.

The trade-off is that range scans are limited to a single partition: you can efficiently query one tenant's time-series data but cannot perform a global time-range scan across all tenants without hitting every partition. This is acceptable for applications with natural partitioning boundaries like tenant_id, user_id, or device_id, where queries are almost always scoped to a single entity.

Implementation requires careful schema design. Choose the partition key component based on distribution and access patterns: it should have high cardinality (many distinct values) to spread load, and queries should almost always include it to avoid scatter-gather. Choose the clustering key component based on range query needs: typically a timestamp, sequence number, or lexicographically ordered identifier. Monitor partition size carefully, because all items for a partition key colocate; a single tenant accumulating terabytes of data will create a storage hotspot. Set per-tenant quotas or use time-based partition key rotation (for example, tenant_id plus month) to bound partition size. Measure query fan-out: if many queries omit the partition key and scan all partitions, reconsider your partitioning scheme or add a global secondary index with a different partition key optimized for those queries.
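The pattern can be sketched with a toy in-memory model in Python. This is an illustrative assumption, not any database's real API: `NUM_PARTITIONS`, the `partitions` structure, and the use of `hashlib` as a stand-in for Murmur3 are all invented for the sketch. The point is the shape: hash only the first key component to pick a partition, keep rows sorted by the second component within it.

```python
import bisect
import hashlib

NUM_PARTITIONS = 8  # illustrative cluster size, not a recommendation

def partition_for(partition_key: str) -> int:
    # Stand-in for a stable hash such as Murmur3: only the FIRST component
    # of the composite key is hashed, so all of a tenant's rows colocate.
    digest = hashlib.md5(partition_key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % NUM_PARTITIONS

# Each physical partition maps tenant_id -> rows kept sorted by the
# clustering key (timestamp), mimicking on-disk clustering order.
partitions = [{} for _ in range(NUM_PARTITIONS)]

def insert(tenant_id: str, timestamp: int, data: str) -> None:
    rows = partitions[partition_for(tenant_id)].setdefault(tenant_id, [])
    bisect.insort(rows, (timestamp, data))

def range_query(tenant_id: str, t1: int, t2: int):
    # Single-partition read: no scatter-gather across the cluster.
    rows = partitions[partition_for(tenant_id)].get(tenant_id, [])
    lo = bisect.bisect_left(rows, (t1,))            # first ts >= t1
    hi = bisect.bisect_right(rows, (t2, "\uffff"))  # last ts <= t2 (toy bound)
    return rows[lo:hi]

insert("acme", 100, "login")
insert("acme", 200, "purchase")
insert("globex", 150, "login")
print(range_query("acme", 50, 150))  # [(100, 'login')]
```

The binary searches in `range_query` are the in-memory analogue of a sequential disk read over a clustered range: cost scales with the result size, not the partition size.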
💡 Key Takeaways
Hash first component for distribution, sort second for range scans: Use tenant_id as the hashed partition key and timestamp as the clustering key. This enables efficient per-tenant time-range queries without scatter-gather across partitions.
Range scans limited to single-partition scope: You can query one tenant's time series efficiently but cannot perform a global time-range scan across all tenants without hitting every partition. Acceptable when queries naturally scope to the partition key.
Choose partition key for high cardinality and query inclusion: The partition key should have many distinct values (tenant_id, user_id, device_id) and appear in nearly all queries. Low cardinality (like country_code with roughly 200 values) limits parallelism.
Monitor per-partition size to avoid storage hotspots: All items for one partition key colocate. A tenant accumulating terabytes creates a storage and throughput hotspot. Set per-tenant quotas or rotate the partition key monthly (tenant_id plus month).
Cassandra and DynamoDB implement this pattern natively: Cassandra uses a compound primary key with the first part hashed and the remainder sorted on disk. DynamoDB hashes the partition key and optionally sorts by a sort key within each partition for its efficient Query API.
Measure query fan-out for schema validation: If many queries omit the partition key and scan all partitions, your partitioning scheme is wrong. Add a global secondary index with a different partition key or redesign access patterns.
📌 Examples
Multi-tenant event logging: Use (tenant_id, timestamp) as the composite key. Hash tenant_id to distribute tenants; sort by timestamp within each partition. The query SELECT * FROM events WHERE tenant_id = 'acme' AND timestamp > t efficiently scans one partition. A query across all tenants requires scatter-gather.
Apache Cassandra schema: CREATE TABLE events (tenant_id text, timestamp timestamp, data text, PRIMARY KEY ((tenant_id), timestamp)) WITH CLUSTERING ORDER BY (timestamp DESC). The partition key tenant_id is hashed via Murmur3; the clustering key timestamp is sorted descending on disk. Per-tenant queries are single-partition reads.
Amazon DynamoDB table: Partition key user_id (hashed), sort key timestamp. The Query API with KeyConditionExpression user_id = '12345' AND timestamp BETWEEN t1 AND t2 scans a single partition. Scanning across all users requires the Scan operation, which hits all partitions with filters applied post-retrieval.
Time-based partition key rotation: Use (tenant_id_month, timestamp) where tenant_id_month = tenant_id concatenated with YYYY_MM. Each month creates a new partition, bounding partition size. Old partitions are archived or deleted after the retention period. Queries spanning months require a multi-partition query.
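The rotation example can be sketched in Python. The `tenant_id_YYYY_MM` naming convention follows the example above; the helper names (`rotated_partition_key`, `month_keys`) are invented for illustration:

```python
from datetime import date

def rotated_partition_key(tenant_id: str, day: date) -> str:
    # tenant_id concatenated with YYYY_MM: each month opens a fresh
    # partition, bounding how large any single partition can grow.
    return f"{tenant_id}_{day.year:04d}_{day.month:02d}"

def month_keys(tenant_id: str, start: date, end: date) -> list:
    # A query spanning months must enumerate one partition key per month:
    # a multi-partition query, unlike the single-month case.
    keys = []
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        keys.append(f"{tenant_id}_{y:04d}_{m:02d}")
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)
    return keys

print(rotated_partition_key("acme", date(2024, 3, 15)))  # acme_2024_03
print(month_keys("acme", date(2023, 11, 1), date(2024, 2, 1)))
# ['acme_2023_11', 'acme_2023_12', 'acme_2024_01', 'acme_2024_02']
```

Note the trade-off this encodes: rotation caps partition size at one month of writes per tenant, but any query window that crosses a month boundary fans out to one read per enumerated key.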