
Series Cardinality: The Hidden Killer of Time Series Database Performance

What Is Series Cardinality

Series cardinality is the number of unique tag combinations in your time series data. Each unique combination creates a separate series requiring its own in-memory index entry. With tags for endpoint (100 values), method (5), status_code (10), and region (4), you have 100 x 5 x 10 x 4 = 20,000 unique series.
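The multiplication above can be sketched directly; this is a minimal illustration (the tag names and counts come from the example, the function is hypothetical):

```python
from math import prod

# Distinct-value counts per tag key, from the example above.
tag_value_counts = {"endpoint": 100, "method": 5, "status_code": 10, "region": 4}

def series_cardinality(counts: dict) -> int:
    """Worst-case series count: product of distinct values per tag key."""
    return prod(counts.values())

print(series_cardinality(tag_value_counts))  # 20000
```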

Now imagine adding user_id as a tag for debugging. With 1 million active users, cardinality jumps to 20 billion series. At even 1KB of metadata per series, that is 20TB of RAM just for indices. This is the #1 cause of production TSDB failures.

Why Memory Fails First

Memory costs 100-1000x more per gigabyte than disk. In-memory indices grow with cardinality while disk grows with data volume. A system ingesting 1 million points/second might have plenty of disk headroom yet crash with an out-of-memory (OOM) error when 10 million unique series consume 10GB+ of RAM for metadata alone.
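The back-of-envelope memory math above looks like this (the 1KB-per-series figure is the article's rough estimate, and the function name is hypothetical):

```python
def index_memory_bytes(series: int, bytes_per_series: int = 1024) -> int:
    """Rough in-memory index footprint: per-series metadata times series count."""
    return series * bytes_per_series

# 10 million series at ~1KB of metadata each is ~10GB of RAM
# before a single data point hits disk.
print(index_memory_bytes(10_000_000) / 1e9)  # 10.24 (GB)
```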

Symptoms and Detection

High cardinality manifests as: OOM kills, ingestion timeouts exceeding SLOs (Service Level Objectives, your latency/availability targets), compaction backlogs as the system struggles to merge thousands of small files, and sudden query degradation. Monitor series count growth rate and alert when cardinality increases faster than expected.
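The detection advice above can be sketched as a simple growth-rate check over periodic series-count samples; the function and thresholds here are hypothetical, not a specific TSDB's API:

```python
def cardinality_alert(samples: list, max_growth_per_interval: int) -> bool:
    """Return True if the series count grew faster than expected between
    any two consecutive samples (taken at a fixed polling interval)."""
    return any(
        later - earlier > max_growth_per_interval
        for earlier, later in zip(samples, samples[1:])
    )

# Steady ~1K/interval growth is fine; a sudden 50K jump trips the alert,
# e.g. someone just added a high-cardinality tag.
print(cardinality_alert([100_000, 101_000, 151_000], max_growth_per_interval=5_000))  # True
```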

Mitigation Strategies

Tag governance: whitelist allowed tag keys; ban high-cardinality fields (UUIDs, request IDs, email addresses).
Cardinality quotas: per-tenant and per-metric limits, with circuit breakers rejecting writes that exceed them.
Top-K aggregation: preserve the top 100 dimensions while rolling up rare combinations into an "other" bucket.
Fields vs. tags: move high-cardinality values to unindexed fields when you do not need to filter by them.
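The first two strategies, tag governance plus a quota-backed circuit breaker, can be sketched together. This is a minimal in-memory illustration under assumed names (ALLOWED_TAG_KEYS, TENANT_SERIES_QUOTA, admit_write); a real implementation would track series counts in the database's index, not a Python dict:

```python
ALLOWED_TAG_KEYS = {"endpoint", "method", "status_code", "region"}  # governance whitelist
TENANT_SERIES_QUOTA = 100_000  # per-tenant cardinality limit

seen_series = {}  # tenant -> set of known tag combinations

def admit_write(tenant: str, tags: dict) -> bool:
    """Reject writes that use unapproved tag keys or would create a new
    series beyond the tenant's quota; existing series keep writing."""
    if not set(tags) <= ALLOWED_TAG_KEYS:
        return False  # banned high-cardinality key, e.g. user_id or request_id
    combo = frozenset(tags.items())
    known = seen_series.setdefault(tenant, set())
    if combo in known:
        return True  # existing series continue writing (partial availability)
    if len(known) >= TENANT_SERIES_QUOTA:
        return False  # circuit breaker: no new tag combinations
    known.add(combo)
    return True
```

Note the design choice: the breaker rejects only *new* tag combinations, so a tenant that blows its quota loses new series but keeps ingesting data for series it already has.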

💡 Key Takeaways
Series cardinality = product of the distinct value counts across all tag keys; adding user_id (1M users) to 20K existing series = 20 billion series requiring 20TB RAM
Memory costs 100-1000x more per GB than disk; in-memory index growth from cardinality hits limits before disk capacity
Symptoms: OOM kills, ingestion timeouts, compaction backlogs merging thousands of small files, sudden query degradation
Tag governance: whitelist allowed keys, ban high-cardinality fields (UUIDs, request IDs, email addresses)
Top-K aggregation preserves top N dimensions while rolling up rare combinations into "other" bucket
Move high-cardinality values to unindexed fields when filtering is not needed (e.g., request_id for debugging only)
📌 Interview Tips
1. Calculate cardinality explosion: 100 endpoints x 5 methods x 10 status codes x 4 regions x 1M users = 20 billion series. Remove the user_id tag = back to 20K.
2. Debug an OOM incident: series count growing 100K/day from container IDs used as tags. Each container restart creates a new series. Solution: use container_name (stable), not container_id (ephemeral).
3. Implement a circuit breaker: when a tenant exceeds its 100K series quota, reject new tag combinations while existing series continue writing for partial availability.