
Time-Based Indexing and Hot-Warm-Cold Tiering

Production systems handling logs, events, or time-series data partition indexes by time bucket (daily or weekly) and apply lifecycle policies that move data through hot, warm, and cold tiers based on access patterns and retention requirements. This pattern dramatically reduces cost while keeping queries over recent data fast.

The hot tier uses NVMe or SSD storage with full replica coverage for active writes and low-latency queries (typically sub-100ms p99). After a time window (often 7 to 30 days), indices roll over to a warm tier with fewer replicas on cheaper HDD storage, accepting higher latency (200 to 500ms) for infrequent queries. The cold tier uses object storage such as AWS Simple Storage Service (S3) or snapshot repositories for long-term retention, accessible only through restore operations.

Netflix demonstrates this pattern at massive scale, ingesting billions of events per day into time-partitioned Elasticsearch indices. Its architecture streams events in from Kafka, rolls indices over automatically when they hit size or age thresholds, and applies lifecycle policies that transition them through the tiers. Interactive dashboards and security detections query hot data at sub-second to few-second latencies, while historical investigations query warm or restored cold data under relaxed Service Level Objectives (SLOs).

The critical tradeoff is balancing index size against shard count. Too many small indices cause over-sharding (hundreds of tiny shards per node waste heap and reduce cache efficiency), while too few large indices slow down rollovers, snapshots, and recovery. Best practice is to roll over on both size (20 to 50 GB) and age, so indices stay within operational bounds regardless of event-rate variability.
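As a concrete illustration, here is a minimal sketch of such a lifecycle expressed through Elasticsearch's ILM API (`_ilm/policy`), assuming a cluster at localhost:9200. The policy name, snapshot repository name, and `data` node attribute are illustrative assumptions; the thresholds mirror the numbers above. Note that the searchable-snapshot action in the cold phase requires a paid license tier; the pure snapshot-then-restore setup described above would instead pair an SLM snapshot policy with a plain delete phase.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

policy = {
    "policy": {
        "phases": {
            # Hot: actively written. Rolling over on size AND age keeps
            # indices near 30 GB whether traffic spikes or goes quiet.
            "hot": {
                "actions": {
                    "rollover": {"max_primary_shard_size": "30gb", "max_age": "1d"}
                }
            },
            # Warm: after 7 days, relocate to cheaper nodes, cut replicas,
            # and force-merge segments since the index is now read-only.
            "warm": {
                "min_age": "7d",
                "actions": {
                    "allocate": {
                        "number_of_replicas": 1,
                        "require": {"data": "warm"},  # assumes node.attr.data: warm
                    },
                    "forcemerge": {"max_num_segments": 1},
                },
            },
            # Cold: after 90 days, back the index by a snapshot in the
            # object-storage repository (repository name is an assumption).
            "cold": {
                "min_age": "90d",
                "actions": {
                    "searchable_snapshot": {"snapshot_repository": "s3-archive"}
                },
            },
            # Delete once the retention window (here one year) expires.
            "delete": {"min_age": "365d", "actions": {"delete": {}}},
        }
    }
}

resp = requests.put(f"{ES}/_ilm/policy/logs-tiered", json=policy)
resp.raise_for_status()
print(resp.json())  # {"acknowledged": true} on success
```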
💡 Key Takeaways
Time-based indices partition data by day or week, enabling efficient lifecycle management and queries targeted at the relevant time ranges
Hot tier (0 to 7 days) uses NVMe with full replicas for sub-100ms queries at higher cost ($500/TB/month); warm tier (8 to 90 days) uses HDD with fewer replicas for 200 to 500ms queries ($100/TB/month)
Cold tier uses object-storage snapshots for long-term retention at the lowest cost ($25/TB/month) but requires a restore operation before querying
Rolling over on both size (20 to 50 GB target) and age prevents over-sharding from low-traffic periods and oversized indices from traffic spikes (wired up in the sketch after this list)
Netflix ingests billions of events daily into time-partitioned indices, serving interactive queries at sub-second to few-second latencies while archiving to the cold tier
Over-sharding (hundreds of small shards per node) wastes heap memory and degrades cache locality, slowing queries even when the cluster has spare capacity
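A companion sketch, under the same assumptions as above, showing how the rollover-by-size-and-age takeaway is wired up in practice: an index template attaches the hypothetical logs-tiered policy to a data stream, so every backing index inherits the lifecycle, and one primary shard per ~30 GB backing index keeps total shard counts low.

```python
import requests

ES = "http://localhost:9200"  # assumed local cluster

template = {
    "index_patterns": ["logs-app-*"],  # hypothetical data stream name
    "data_stream": {},  # backing indices are created and rolled over under ILM control
    "template": {
        "settings": {
            "index.lifecycle.name": "logs-tiered",  # policy from the earlier sketch
            # One primary shard per ~30 GB backing index avoids the over-sharding
            # failure mode: a few right-sized shards instead of hundreds of tiny ones.
            "index.number_of_shards": 1,
            "index.number_of_replicas": 1,
        }
    },
}

requests.put(f"{ES}/_index_template/logs-tiered-template", json=template).raise_for_status()
```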
📌 Examples
Netflix observability: streams events to Elasticsearch, rolls daily indices over at 30 GB, keeps 7 days hot on SSD and 90 days warm on HDD, and archives cold data to S3
Log aggregation pattern: index logs-2024-01-15 starts on the hot tier, transitions automatically to warm after 7 days, then is snapshotted to S3 and deleted after 90 days
Cost optimization: 10 TB of hot data at $500/TB = $5,000/month; moving 8 TB to warm at $100/TB saves $3,200/month with an acceptable latency tradeoff (worked through in the sketch below)
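The cost-optimization example works out as follows. This is a trivial sketch: the $/TB/month rates come from the takeaways above, and the 2 TB hot / 8 TB warm split is the example's assumption.

```python
HOT, WARM = 500, 100  # $/TB/month, from the tier costs above

all_hot = 10 * HOT            # 10 TB kept entirely on the hot tier
tiered = 2 * HOT + 8 * WARM   # keep 2 TB hot, move 8 TB to warm

print(f"all hot: ${all_hot}/mo, tiered: ${tiered}/mo, saved: ${all_hot - tiered}/mo")
# all hot: $5000/mo, tiered: $1800/mo, saved: $3200/mo
```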