
Specialized Databases: Search, Graph, and Time Series Use Cases

Specialized databases solve specific problems that general-purpose databases handle poorly. Using PostgreSQL for full text search or time series data works at small scale but becomes slow and operationally expensive as you grow.

Elasticsearch dominates full text search by inverting the data model: instead of storing documents and scanning them for keywords, it indexes every term and points each term to the documents that contain it. GitHub uses Elasticsearch for code search across millions of repositories, delivering sub-500ms search results with fuzzy matching, typo tolerance, and relevance ranking. The trade-off is write latency of 100 to 1000ms, because every document must be analyzed, tokenized, and indexed before it becomes searchable. Elasticsearch also uses significant memory (a common recommendation is 1GB per 10GB of indexed data) and its relevance algorithms require tuning. You would not use Elasticsearch as your primary database, because it lacks ACID guarantees and optimizes for search over transactional consistency.

Graph databases like Neo4j excel at relationship traversal. The canonical workload is a feature like LinkedIn's "People You May Know", which traverses friend of friend connections in milliseconds. A query finding mutual connections three hops away takes under 10ms in Neo4j, versus 5+ seconds in PostgreSQL, where it requires multiple self joins. The limitation: graph databases are overkill for simple relationships. If you only need to find direct connections (one hop), PostgreSQL with proper indexes suffices and avoids the operational complexity of running Neo4j.

Time series databases (InfluxDB, TimescaleDB) compress sequential data efficiently and optimize for time range queries. Uber uses InfluxDB for real time monitoring, ingesting 2 million metrics per second with automatic downsampling: keep raw data for 7 days, then aggregate to 1 minute buckets for 30 days, then 1 hour buckets forever. Combined with compression, this reduces storage from petabytes to terabytes. Regular databases lack these built-in retention policies and compression, forcing you to build them yourself.
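To make the inverted data model described for Elasticsearch concrete, here is a minimal in-memory sketch. It is not how Elasticsearch is implemented (there is no analysis chain, relevance scoring, or fuzzy matching), and the names `tokenize`, `build_inverted_index`, and `search` are illustrative assumptions. It only shows why a lookup touches the matching terms rather than scanning every document, and why each write has to pay for tokenizing and indexing up front.

```python
from collections import defaultdict
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start from the first term's postings and intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by id.
docs = {
    1: "Fix race condition in connection pool",
    2: "Add connection retry with backoff",
    3: "Document the retry policy",
}
index = build_inverted_index(docs)
print(search(index, "connection retry"))  # {2}
```

The query cost depends on the size of the posting sets for the query terms, not on the total number of documents, which is the property the section attributes to search engines.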
💡 Key Takeaways
Search databases trade write speed for read flexibility: Elasticsearch writes take 100 to 1000ms versus 10 to 100ms for PostgreSQL, but Elasticsearch handles fuzzy matching and relevance ranking that would require complex PostgreSQL extensions.
The graph traversal performance gap widens exponentially with depth: Neo4j handles 3-hop queries (friend of friend of friend) in under 10ms, while the equivalent PostgreSQL query with self joins takes 5+ seconds at million-user scale.
Time series compression ratios can exceed 90%: InfluxDB shrinks sequential metrics from petabytes to terabytes via delta encoding and automatic downsampling, whereas PostgreSQL requires manual partitioning and custom compression logic.
Operational complexity increases with specialized databases: running an Elasticsearch cluster requires understanding shards, replicas, and index optimization, which adds operational burden compared with using PostgreSQL full text search for smaller datasets.
Cost effectiveness depends on scale: Elasticsearch makes sense at 100GB+ of searchable content. Below that, PostgreSQL full text search (included free) handles 10,000 queries per second adequately.
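The depth takeaway can be illustrated with a bounded breadth-first traversal over an adjacency list. This is a hedged sketch, not Neo4j's engine or query language: the function `within_hops` and the sample graph are hypothetical. The point is that each hop follows a node's neighbor list directly, while a relational schema would express each additional hop as another self join over the whole connections table.

```python
from collections import deque


def within_hops(graph, start, max_hops):
    """Return {node: hop_count} for nodes reachable within max_hops of start."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distance[node] == max_hops:
            continue  # do not expand past the hop limit
        for neighbor in graph.get(node, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    del distance[start]  # the start node is not its own connection
    return distance


# Hypothetical undirected friendship graph stored as adjacency lists.
graph = {
    "ana": ["bo", "cy"],
    "bo": ["ana", "dee"],
    "cy": ["ana", "dee"],
    "dee": ["bo", "cy", "eli"],
    "eli": ["dee", "fay"],
    "fay": ["eli"],
}
print(within_hops(graph, "ana", 3))
# {'bo': 1, 'cy': 1, 'dee': 2, 'eli': 3}
```

Each node is visited at most once, so the work is proportional to the part of the graph actually reached rather than to the size of the full table multiplied per hop.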
📌 Examples
Netflix uses Elasticsearch for content search: 230 million subscribers search across titles, descriptions, and actor names with typo tolerance. Netflix serves results in under 100ms by precomputing relevance scores and distributing the index across 50+ nodes.
eBay uses Neo4j for fraud detection: it analyzes transaction networks to find suspicious patterns, traversing buyer, seller, and product relationships 4 to 5 hops deep in real time. The equivalent SQL query would time out or require extensive denormalization.
Tesla uses TimescaleDB (a time series extension for PostgreSQL) for vehicle telemetry: millions of vehicles send GPS coordinates and sensor readings every second, and automatic data retention deletes raw data after 90 days while keeping aggregated hourly metrics forever.
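The retention pattern in the telemetry example (drop raw points after 90 days, keep hourly aggregates) can be sketched in a few lines. This is an illustrative assumption of the logic only, not the TimescaleDB or InfluxDB API, and `apply_retention` is a hypothetical name. It keeps only the mean per bucket; a production rollup would likely also keep min, max, and count.

```python
from collections import defaultdict

HOUR = 3600
DAY = 24 * HOUR


def apply_retention(points, now, raw_retention=90 * DAY, bucket=HOUR):
    """Split points into recent raw points and per-bucket means of older ones.

    points: iterable of (unix_timestamp_seconds, value) pairs.
    Returns (raw, rollups), where rollups maps bucket start time to mean value.
    """
    cutoff = now - raw_retention
    raw = []
    sums = defaultdict(lambda: [0.0, 0])  # bucket start -> [sum, count]
    for ts, value in points:
        if ts >= cutoff:
            raw.append((ts, value))  # still inside the raw retention window
        else:
            start = ts - ts % bucket  # align to the start of the bucket
            sums[start][0] += value
            sums[start][1] += 1
    rollups = {start: total / count for start, (total, count) in sums.items()}
    return raw, rollups


# Hypothetical readings: three old points and one recent point.
now = 100 * DAY
points = [(0, 10.0), (1800, 20.0), (3600, 30.0), (99 * DAY, 40.0)]
raw, rollups = apply_retention(points, now)
print(raw)      # [(8553600, 40.0)]
print(rollups)  # {0: 15.0, 3600: 30.0}
```

Three old raw points collapse into two hourly values here; at one reading per second, an hourly bucket replaces 3,600 rows with one, which is where most of the storage reduction from downsampling comes from.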