Database Design: Time-Series Databases (InfluxDB, TimescaleDB)

What Are Time-Series Databases and How Do They Differ from Traditional Databases?

Time-series databases (TSDBs) are optimized specifically for data points indexed by time: metrics, sensor readings, application logs. Unlike traditional databases that balance reads and writes, TSDBs assume an append-heavy workload where you write continuously but query specific time windows.

The core architecture follows a near-universal pattern. The write path buffers incoming events in memory, backed by a Write-Ahead Log (WAL) for crash recovery, then persists them in time-partitioned segments (hourly or daily chunks). When you query metrics from the last 6 hours, the system reads only those segments rather than scanning everything. This partition pruning is the secret weapon that lets queries over billions of data points finish in milliseconds.

Storage leverages columnar compression tuned for time-ordered sequences: timestamps compress via delta-of-delta encoding (storing the difference between successive differences), floating-point values use Gorilla-style XOR compression, and low-cardinality strings get dictionary encoding. Real-world results show storage 8 to 59 times smaller than naive row formats; one test with 100 devices each writing one metric achieved 59x compression, while 4,000 devices with 10 metrics each compressed 8x.

The read path emphasizes range scans over time windows, windowed aggregations such as per-minute averages, and downsampling, where full-resolution data is kept for days but rolled up to minute-level granularity for months.

Memory, not disk, becomes the first bottleneck: in-memory indices over high-cardinality tag combinations can explode, and memory costs 100 to 1000 times more per gigabyte than disk. Production systems therefore enforce strict limits on unique tag combinations and aggressively compress data at rest.
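To make the write path and partition pruning concrete, here is a minimal sketch. It is not any real engine's storage format: the class name, the hourly segment size, and the plain-text WAL layout are all invented for illustration.

```python
import os
from collections import defaultdict

SEGMENT_SECONDS = 3600  # hypothetical hourly segments

class TinyTSDB:
    """Toy write/read path: WAL append first, then in-memory,
    time-partitioned segments that the query path can prune."""

    def __init__(self, wal_path="wal.log"):
        self.wal = open(wal_path, "a")      # append-only log for crash recovery
        self.segments = defaultdict(list)   # segment start ts -> [(ts, value), ...]

    def write(self, ts, value):
        self.wal.write(f"{ts} {value}\n")   # durability before acknowledging
        self.wal.flush()
        os.fsync(self.wal.fileno())
        self.segments[ts - ts % SEGMENT_SECONDS].append((ts, value))

    def query(self, start, end):
        # Partition pruning: skip every segment outside [start, end)
        # without scanning any of its points.
        for seg_start in sorted(self.segments):
            if seg_start + SEGMENT_SECONDS <= start or seg_start >= end:
                continue
            for ts, value in self.segments[seg_start]:
                if start <= ts < end:
                    yield ts, value

db = TinyTSDB()
db.write(100, 0.5)                   # lands in segment 0
db.write(7300, 0.7)                  # lands in segment 7200
print(list(db.query(7000, 8000)))    # [(7300, 0.7)] -- segment 0 is pruned
```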
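The compression idea can be sketched the same way. Below is a toy delta-of-delta encoder for timestamps; the function names and the integer-list representation are illustrative, since real engines such as Gorilla pack these values at the bit level, where the long runs of zeros cost roughly one bit each.

```python
def delta_of_delta_encode(timestamps):
    """Encode a sorted timestamp list as (header, delta-of-deltas).
    Regularly sampled series produce mostly zeros, which compress well."""
    if len(timestamps) < 2:
        return timestamps[:], []
    deltas = [b - a for a, b in zip(timestamps, timestamps[1:])]
    dods = [b - a for a, b in zip(deltas, deltas[1:])]
    return [timestamps[0], deltas[0]], dods

def delta_of_delta_decode(header, dods):
    first, delta = header
    out = [first]
    for dod in [0] + dods:     # replay each delta-of-delta to rebuild deltas
        delta += dod
        out.append(out[-1] + delta)
    return out

ts = [1000, 1010, 1020, 1030, 1041]        # 10s sampling, one late point
header, dods = delta_of_delta_encode(ts)
assert delta_of_delta_decode(header, dods) == ts
print(dods)                                 # [0, 0, 1] -- mostly zeros
```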
💡 Key Takeaways
The write path buffers data in memory with a WAL for durability, then persists to time-partitioned segments, enabling partition pruning where queries read only the relevant time windows
Columnar compression with delta-of-delta timestamps, XOR-encoded floats, and dictionary-encoded strings achieves storage 8 to 59x smaller than row formats, depending on cardinality
Memory becomes the bottleneck before disk, because in-memory indices over unique tag combinations grow rapidly and memory costs 100 to 1000x more per gigabyte than disk
Lifecycle management is built in, with retention policies that delete old data and continuous aggregates that downsample from second-level to minute-level granularity as data ages (see the sketch after this list)
Netflix Atlas ingests multiple millions of samples per second with sub-second query Service Level Objectives (SLOs) for dashboards by using push-based client aggregation and aggressive rollups
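A toy version of that downsampling rollup, assuming simple per-bucket averages over (timestamp, value) pairs; real continuous aggregates are materialized and updated incrementally, which this sketch is not:

```python
from collections import defaultdict

def downsample(points, bucket_seconds=60):
    """Roll full-resolution (ts, value) points up to per-bucket averages,
    the kind of rollup applied as data ages out of its full-resolution tier."""
    buckets = defaultdict(list)
    for ts, value in points:
        buckets[ts - ts % bucket_seconds].append(value)
    return {b: sum(v) / len(v) for b, v in sorted(buckets.items())}

raw = [(0, 1.0), (15, 3.0), (30, 5.0), (60, 2.0), (75, 4.0)]
print(downsample(raw))   # {0: 3.0, 60: 3.0}
```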
📌 Examples
Datadog enforces strict per-metric tag limits and top-k rollups to prevent the cardinality explosion that would cause out-of-memory failures across millions of metrics (a toy top-k rollup appears after these examples)
TimescaleDB automatically partitions PostgreSQL tables into hypertable chunks by time and space, then uses constraint exclusion at query planning time to skip irrelevant chunks
InfluxDB 3.x routes writes through ingesters that validate, partition by day, deduplicate, and persist Parquet columnar files, achieving 10 to 100x compression over raw data
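To illustrate the top-k idea, here is a minimal sketch, not Datadog's actual mechanism: the function name, the overflow tag, and the input shape (tag value to sample count) are all assumptions made for the example.

```python
from collections import Counter

def topk_rollup(series_counts, k=3, overflow_tag="__other__"):
    """Keep the k highest-volume tag values and fold the long tail into one
    overflow series, bounding unique tag combinations (and index memory)."""
    keep = {tag for tag, _ in Counter(series_counts).most_common(k)}
    rolled = Counter()
    for tag, count in series_counts.items():
        rolled[tag if tag in keep else overflow_tag] += count
    return dict(rolled)

counts = {"host-a": 900, "host-b": 700, "host-c": 650, "host-d": 3, "host-e": 1}
print(topk_rollup(counts))
# {'host-a': 900, 'host-b': 700, 'host-c': 650, '__other__': 4}
```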