Database DesignIndexing StrategiesHard⏱️ ~3 min

How Do Hot Spots and Skew Break Indexing Performance?

Hot Spot Problem

Monotonically increasing keys (timestamps, auto-increment IDs, time-prefixed UUIDs) create hot spots. Every new insert goes to the same location: the rightmost leaf page in a B+ tree or a single partition in a distributed database. At 10K-50K inserts/sec, this single location becomes a bottleneck. Threads queue for access, inflating P99 (99th percentile latency) from single-digit ms to hundreds of ms.

Common Pitfall: Using timestamp or auto-increment as primary key concentrates all recent writes. Use random UUIDs, hash prefixes, or shard keys to spread writes.

Low Selectivity Problem

Low cardinality columns (columns with few distinct values, like status with only active/inactive) make poor leading index columns. With only 2 values, an index scan must read 50% of the index on average. Skewed distributions worsen this: if 99% of rows are status=active, an index on status provides almost no filtering benefit.

Solutions

Key salting: Prepend a random or computed prefix to spread writes across the index. Trade-off: point queries must check multiple prefixes.
Write sharding: Spread a hot key across multiple logical keys (counter_0 through counter_9), aggregate reads from all shards.
Partial indexes: Index only rows matching a condition (WHERE status = pending), reducing index size for low-selectivity columns.

Detection

Monitor write latency distribution. If P99 is 10x+ higher than P50 (median), hot spots likely exist. Check partition/page access patterns to identify the hot location.

💡 Key Takeaways
Monotonic keys (timestamps, auto-increment) create hot spots: all inserts go to same location
Hot spots inflate P99 from single-digit ms to hundreds of ms as threads queue for access
Low selectivity columns (status with 2 values) provide little filtering benefit as leading index column
Solutions: key salting (random prefix), write sharding (multiple logical keys), partial indexes
📌 Interview Tips
1Hot spot detection: P99 is 200ms while P50 is 5ms suggests hot spot; check which partition/page is overloaded
2Key salting: prepend hash(user_id) % 10 to timestamp key, spreading writes across 10 locations
3Partial index: CREATE INDEX idx ON orders(created_at) WHERE status = pending indexes only pending orders
← Back to Indexing Strategies Overview