Read Heavy Optimization: Precomputation and Locality
Read Optimization Principles
Read optimization centers on two principles: move data closer to users and compute expensive operations ahead of time. Both minimize work on the critical read path, trading extra storage and some staleness for dramatically lower latency and cost per read. A well-optimized read-heavy system can serve 95%+ of requests from cache with < 10 ms latency.
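The payoff of a high hit rate is simple expected-value arithmetic. A minimal sketch, where the 5 ms cache latency and 50 ms origin latency are illustrative assumptions, not figures from the text:

```python
def effective_read_latency(hit_rate: float, cache_ms: float, origin_ms: float) -> float:
    """Expected read latency: hits served by the cache, misses by the origin."""
    return hit_rate * cache_ms + (1.0 - hit_rate) * origin_ms

# 95% of reads hit a 5 ms cache; the remaining 5% pay the 50 ms origin cost.
print(effective_read_latency(0.95, 5.0, 50.0))  # 7.25 ms on average
```

Note how the origin term dominates: dropping the hit rate from 95% to 90% nearly doubles the origin's contribution to average latency.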
Multi-Layer Caching
Locality optimization employs multiple cache layers. A typical architecture flows: browser cache → CDN (Content Delivery Network, servers at the network edge) → regional cache → origin database. Each layer absorbs traffic before it reaches the next. A single in-memory cache node can handle 1-2 million operations/sec. CDNs achieve < 10 ms p50 latency for static content by serving it from memory at the edge. Origin fallbacks push p99 to tens of ms, but cache hit rates above 95% keep this rare.
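The layered flow above can be sketched as a read-through lookup: check each layer in order, and on a hit backfill the faster layers that missed so subsequent reads stop earlier. This is a minimal in-process model; the layer names and the dict-backed stores are illustrative stand-ins for real browser/CDN/regional tiers:

```python
from typing import Callable, Dict, List, Optional

class CacheLayer:
    """One tier in the hierarchy (e.g. CDN or regional cache)."""
    def __init__(self, name: str):
        self.name = name
        self.store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self.store.get(key)

    def put(self, key: str, value: str) -> None:
        self.store[key] = value

def read_through(key: str, layers: List[CacheLayer],
                 origin: Callable[[str], str]) -> str:
    """Check layers nearest-first; on a hit, backfill the layers that missed."""
    for i, layer in enumerate(layers):
        value = layer.get(key)
        if value is not None:
            for upper in layers[:i]:   # populate faster tiers
                upper.put(key, value)
            return value
    value = origin(key)                # full miss: only now does the origin see the read
    for layer in layers:
        layer.put(key, value)
    return value
```

After one full miss, every tier holds the value, so the next read for the same key is absorbed by the first layer and never reaches the origin.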
Precomputation and Materialized Views
Precomputation avoids expensive joins and aggregations on the hot path. Instead of joining 5 tables to render a page, precompute and store the combined result. A timeline feature can use fan-out-on-write: when a user posts, immediately write to each follower's inbox rather than computing the timeline on read. This converts a complex multi-table query into a simple range scan. Update materialized views synchronously for small data sets, and asynchronously via change-stream consumers for large fan-out.
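A minimal sketch of fan-out-on-write, with in-memory dicts standing in for the follower graph and per-user inbox tables; the inbox cap is an illustrative assumption (real systems bound inbox size to limit storage):

```python
from collections import defaultdict, deque
from typing import Deque, Dict, List, Set

MAX_INBOX = 800  # illustrative cap on precomputed timeline length

followers: Dict[str, Set[str]] = defaultdict(set)   # author -> follower ids
inboxes: Dict[str, Deque[str]] = defaultdict(lambda: deque(maxlen=MAX_INBOX))

def post(author: str, post_id: str) -> None:
    """Fan-out-on-write: push the new post into every follower's inbox now,
    so no join is needed at read time."""
    for follower in followers[author]:
        inboxes[follower].appendleft(post_id)  # newest first

def timeline(user: str, limit: int = 20) -> List[str]:
    """Reading a timeline is now a simple range scan over one inbox."""
    return list(inboxes[user])[:limit]
```

The write amplification is explicit here: one post costs one inbox write per follower, which is exactly the trade the section describes.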
Trade-offs of Read Optimization
The costs: cache invalidation complexity, replication lag causing staleness (typically seconds to minutes), and write amplification. Each additional index or materialized view adds 1-3x write overhead. Monitor replication lag using LSN (Log Sequence Number, the position in the write-ahead log). Implement read-after-write consistency when needed by routing the writer's subsequent reads to the primary for a brief period, or by requiring replicas to catch up before serving.
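The pin-to-primary approach can be sketched as a per-user timer: a user who just wrote reads from the primary until replicas have plausibly caught up. The 5-second window is an illustrative assumption; in practice it would be tuned to observed replication lag:

```python
import time
from typing import Dict

PIN_SECONDS = 5.0  # illustrative: should exceed typical replication lag
last_write_at: Dict[str, float] = {}  # user id -> monotonic time of last write

def record_write(user: str) -> None:
    """Call after every write so this user's reads get pinned to the primary."""
    last_write_at[user] = time.monotonic()

def choose_endpoint(user: str) -> str:
    """Read-after-write consistency: recent writers read from the primary;
    everyone else can be served by a (possibly lagging) replica."""
    if time.monotonic() - last_write_at.get(user, float("-inf")) < PIN_SECONDS:
        return "primary"
    return "replica"
```

The alternative the text mentions, waiting for replicas to reach the writer's LSN, trades read latency for primary load instead, and is the better fit when the primary is already saturated.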