
Database Selection Framework: Core Decision Factors

Choosing the right database means evaluating four dimensions that directly affect your system's scalability, reliability, and cost: data model, scale and performance, consistency versus availability, and operations. This framework helps you move beyond technology hype and make data-driven decisions.

First, analyze your data model. PostgreSQL excels when you need complex joins across normalized tables (such as financial ledgers that require referential integrity), while MongoDB shines for flexible schemas where each product in a catalog might have different attributes.

Second, evaluate scale and performance. Redis delivers sub-millisecond latency at over 1 million operations per second for session storage, whereas BigQuery answers petabyte-scale analytical queries in seconds, which is fast for analytics but far too slow for request-path lookups.

Third, consider the trade-off between consistency and availability. Payment systems commonly keep transactions in a relational database such as PostgreSQL because ACID (Atomicity, Consistency, Isolation, Durability) guarantees prevent double charges, and they accept 10 to 100 ms write latency in return. Netflix uses Cassandra for viewing history, choosing eventual consistency and 1 to 10 ms writes to handle 2.5 trillion writes per day across global regions. The CAP (Consistency, Availability, Partition tolerance) theorem forces the choice during a network partition: strong consistency with potential unavailability, or high availability with temporary inconsistency.

Fourth, operational considerations often determine real-world success. A startup running five different databases faces far more operational complexity than one running PostgreSQL with Redis for caching. Team expertise matters: migrating to Cassandra might unlock horizontal scalability, but if your team has never operated a distributed database, expect a learning curve of months and potential outages. Managed cloud services like Amazon Aurora or DynamoDB trade higher costs ($200 to $500 per month for moderate workloads) for a lower operational burden than self-managed databases.
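The decision factors above can be read as an ordered set of rules. The Python sketch below is one possible encoding, not a definitive selector: the `Workload` fields, the `recommend_store` function, the rule order (consistency needs checked first), and the relational default are all assumptions made for illustration. Only the database examples come from the text.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of one use case; every field name is illustrative."""
    needs_joins: bool = False         # complex joins across normalized tables
    needs_acid: bool = False          # e.g. payments or ledgers
    flexible_schema: bool = False     # records whose attributes differ per item
    analytical: bool = False          # large scans where seconds of latency are fine
    max_latency_ms: float = 100.0     # per-operation latency budget
    write_heavy_global: bool = False  # very high write volume across regions


def recommend_store(w: Workload) -> str:
    """Return the database family suggested by the rules of thumb in the text.

    The order of the checks is an assumption: correctness requirements win
    over performance, and a relational store is the fallback.
    """
    if w.needs_acid or w.needs_joins:
        return "relational (e.g. PostgreSQL)"
    if w.analytical:
        return "analytical warehouse (e.g. BigQuery)"
    if w.max_latency_ms < 1.0:
        return "in-memory key-value (e.g. Redis)"
    if w.write_heavy_global:
        return "wide-column (e.g. Cassandra)"
    if w.flexible_schema:
        return "document (e.g. MongoDB)"
    return "relational (e.g. PostgreSQL)"
```

For example, `recommend_store(Workload(flexible_schema=True))` returns the document family, while adding `needs_acid=True` to the same workload switches the answer to relational. A real evaluation would also weigh team expertise and cost, which this sketch deliberately leaves out.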
💡 Key Takeaways
Data model mismatch causes technical debt: forcing unstructured content into PostgreSQL leads to expensive migrations later, while putting financial transactions in MongoDB gives up the relational integrity guarantees you need
Performance numbers vary dramatically: Redis responds in under 1 ms, PostgreSQL in 5 to 50 ms, and BigQuery in 1 to 30 seconds; choosing the wrong database adds latency you cannot optimize away
Consistency trade-offs are permanent: strong consistency adds 50 ms or more for cross-region coordination, while eventual consistency risks showing stale data for seconds after a write; no configuration removes this fundamental difference
Operational complexity grows faster than the number of databases: two databases require understanding their interaction patterns, while five can mean your team spends more time on database operations than on feature development
Managed cloud services cost 2x to 3x more than self-hosted setups but greatly reduce the on-call burden: Amazon Aurora costs about $200 per month versus $100 for self-managed PostgreSQL on EC2, but includes automated backups, failover, and patching
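The latency figures in the takeaways can be turned into a quick budget check. The sketch below is a minimal illustration: the ranges are the rough numbers quoted above rather than benchmarks, and `TYPICAL_LATENCY_MS` and `fits_budget` are hypothetical names introduced here.

```python
# Rough latency ranges in milliseconds, copied from the takeaways above.
# They are illustrative figures, not measurements of any real deployment.
TYPICAL_LATENCY_MS = {
    "Redis": (0.0, 1.0),
    "PostgreSQL": (5.0, 50.0),
    "BigQuery": (1_000.0, 30_000.0),
}


def fits_budget(budget_ms: float) -> list[str]:
    """Return the stores whose upper typical latency stays within the budget."""
    return [
        name
        for name, (_low, high) in TYPICAL_LATENCY_MS.items()
        if high <= budget_ms
    ]
```

With a 100 ms budget, `fits_budget(100)` keeps Redis and PostgreSQL and drops BigQuery, which matches the point that no amount of tuning moves an analytical warehouse into the request path.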
📌 Examples
Uber evaluates databases per use case: MySQL for ride transactions (ACID required), Cassandra for trip history (high write throughput), Redis for real-time pricing (sub-millisecond latency), and Elasticsearch for location search (geospatial queries)
Discord migrated from MongoDB to Cassandra once its message history passed roughly 100 million stored messages: MongoDB sharding had become operationally complex, while Cassandra's write-optimized architecture suits append-only messages at very large scale
Segment migrated from MongoDB to PostgreSQL despite the scaling challenges, because data integrity bugs caused by eventual consistency cost more than the scaling effort; consistency requirements can outweigh raw performance