Database Design › Normalization vs Denormalization · Hard · ⏱️ ~3 min

Failure Modes: Backfills, Hot Keys, and Drift in Denormalized Systems

Schema Evolution Challenge

Adding a field to denormalized data requires backfilling existing rows. A new discount_percentage field on 100 million product rows takes hours to backfill. During backfill, some rows have the field, others do not. Application code must handle both. Strategy: deploy code that handles missing field gracefully, backfill incrementally (batches of 10,000 rows with pauses to avoid overwhelming the database), verify all rows updated, then make the field required. Normalized schemas have one row to update; denormalized schemas have millions.
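The incremental backfill described above can be sketched as follows. This is a minimal illustration using SQLite as a stand-in; the `products` table, the `discount_percentage` column, and the batch/pause constants are assumptions, not a definitive implementation:

```python
import sqlite3
import time

BATCH_SIZE = 10_000     # rows per batch, as described above
PAUSE_SECONDS = 0.5     # pause between batches to avoid overwhelming the database

def backfill_discount(conn, batch_size=BATCH_SIZE, pause=PAUSE_SECONDS):
    """Incrementally populate the new field on rows that are still missing it."""
    while True:
        # Find the next batch of rows where the new field is still unset.
        rows = conn.execute(
            "SELECT id FROM products WHERE discount_percentage IS NULL "
            "ORDER BY id LIMIT ?",
            (batch_size,),
        ).fetchall()
        if not rows:
            break  # every row has been backfilled
        conn.executemany(
            "UPDATE products SET discount_percentage = 0 WHERE id = ?",
            [(r[0],) for r in rows],
        )
        conn.commit()
        time.sleep(pause)  # throttle so the backfill stays below cluster capacity
```

Because each batch commits independently, the job can be stopped and resumed at any point; application code must tolerate `NULL` values until the final verification pass.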

Storage Cost Multiplication

Denormalization multiplies storage. A normalized product catalog might be 5 GB. Denormalizing into a search index, a recommendation cache, a listing cache, and a mobile API cache could total 25+ GB, a 5x increase. At cloud storage rates of $20-50/TB/month, that 5x multiplier adds roughly $100-250/month for every TB of source data. Factor this into architecture decisions. The justification is usually compute savings: avoiding expensive joins saves more than the storage costs.
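A back-of-envelope estimate using the figures above (the function name and inputs are illustrative):

```python
def monthly_denorm_cost(source_tb, multiplier, rate_per_tb_month):
    """Estimated monthly storage bill for all copies of the data combined."""
    return source_tb * multiplier * rate_per_tb_month

# 1 TB of source data, a 5x denormalization multiplier,
# at the low and high ends of the $20-50/TB/month range:
low = monthly_denorm_cost(1, 5, 20)   # 100 ($/month)
high = monthly_denorm_cost(1, 5, 50)  # 250 ($/month)
```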

Debugging Stale Data

When users report seeing incorrect data, is the source wrong or is propagation lagging? Build observability: log event publish timestamps, consumer processing timestamps, and current source values alongside denormalized values. When investigating, query both source and denormalized stores with the same ID and compare. If source shows $15 but denormalized shows $10, the issue is propagation lag or a stuck consumer. Include version numbers to determine when the denormalized copy was last updated.
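The compare-both-stores step can be sketched like this. The dict-backed stores and field names (`price`, `version`) are hypothetical stand-ins for the real source database and denormalized store:

```python
def diagnose_stale(source_store, denorm_store, item_id):
    """Fetch the same ID from both stores and report divergence and version lag."""
    src = source_store[item_id]   # authoritative record
    den = denorm_store[item_id]   # denormalized copy
    return {
        "values_match": src["price"] == den["price"],
        # A positive lag means the denormalized copy missed one or more updates,
        # pointing at propagation lag or a stuck consumer.
        "version_lag": src["version"] - den["version"],
    }
```

In production this function would also log the event publish and consumer processing timestamps mentioned above, so lag can be measured in wall-clock time as well as versions.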

Handling Deletes

Deleting source data must propagate to all denormalized copies. If a product is discontinued but its denormalized listing remains, users see ghost products. The propagation pattern: the source delete publishes a tombstone event (a marker indicating deletion), and consumers remove the denormalized row. If any consumer misses the tombstone, the ghost persists. Defensive design: use is_active flags rather than hard deletes, and run periodic reconciliation jobs that compare source and denormalized stores and remove orphans.
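A minimal sketch of the reconciliation job, assuming the source's active IDs fit in a set and the denormalized store is dict-like (real implementations would scan both sides in keyspace ranges):

```python
def reconcile(active_source_ids, denorm_store):
    """Remove denormalized rows whose source row no longer exists.

    Idempotent: running it twice removes nothing the second time, so a
    missed tombstone is cleaned up on the next pass regardless.
    """
    orphans = [key for key in list(denorm_store) if key not in active_source_ids]
    for key in orphans:
        del denorm_store[key]  # drop the ghost row
    return orphans  # report what was cleaned, for alerting on tombstone loss
```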

💡 Key Takeaways
- Backfill scale risk: recomputing 100 billion rows at 50,000 rows per second takes 23 days single-threaded; 1,000-way parallelism reduces that to 33 minutes, but the I/O burst can evict cache and spike latency. Throttle to 10% of cluster capacity and process in rolling keyspace windows.
- Schema evolution via dual write: add new fields behind a feature flag, write both schemas for days to weeks, validate parity, flip reads, then backfill incrementally in 1-million-row batches with rate limits; monitor cache hit rate (a 5% degradation can breach SLOs).
- Hot key mitigation through sharding: a single viral post accessed millions of times per second overwhelms its partition; sharding counters into 64-128 buckets with random routing distributes the writes. Reads sum the shards, adding latency but preventing the hotspot.
- Counter drift: dropped or duplicate increments cause several percent divergence on long-lived objects without reconciliation; periodic jobs every 10-60 minutes scan and recompute from the event log as the source of truth.
- Delete tombstones must propagate to all projections: missed tombstones create ghost items in feeds; make deletes idempotent and retryable, with a time to live. Cross-region lag of 100-300 milliseconds means deleted items stay briefly visible in remote regions, requiring optimistic UI hiding.
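The hot-key mitigation above can be sketched as a sharded counter. This in-memory version is illustrative; in practice each shard would be a separate row or key in the datastore:

```python
import random

class ShardedCounter:
    """Spread increments across many shards so no single key becomes hot."""

    def __init__(self, num_shards=128):
        self.shards = [0] * num_shards

    def increment(self, amount=1):
        # Route each write to a random shard: per-shard write rate is roughly
        # total rate / num_shards, keeping any one partition below its limit.
        self.shards[random.randrange(len(self.shards))] += amount

    def value(self):
        # Reads fan out and sum every shard -- a few extra milliseconds of
        # latency, traded for eliminating the write hotspot.
        return sum(self.shards)
```

The shard count (128 here) trades write dispersion against read fan-out cost; the key takeaway above suggests 64-128 buckets for viral-post workloads.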
📌 Interview Tips
1. Meta feed backfill strategy: when adding a new ranking feature to denormalized feed rows, a feature flag enables dual writes of the old and new features for 2 weeks; the backfill runs at 5% of cluster write capacity in 10-million-row keyspace batches; cache hit rate is monitored and the backfill is paused if the rate drops below 92%, preventing latency spikes.
2. Pinterest viral pin hot key: a single pin receiving 5 million views in 1 hour would overwhelm a single counter partition; sharding the counter into 128 buckets with random increment routing keeps per-shard writes under 10,000 per second; the read path sums the 128 shards in parallel, adding 5 milliseconds but preventing a partition hotspot; an approximate count using HyperLogLog (about 1% error margin) is under consideration for the future.