
Priority Lane Isolation and Throughput Planning

Shared notification pipelines create a critical failure mode: bulk promotional campaigns can starve time-sensitive traffic such as one-time passwords or fraud alerts. The solution is priority lane isolation: separate message queue topics with dedicated worker pools. For a system targeting 10,000 notifications per second at peak, you might split traffic into high-priority (one-time passwords and fraud alerts, 10% of traffic), medium-priority (transactional updates, 30%), and low-priority (marketing, 60%) topics. Each lane gets dedicated consumers sized to handle its own peak independently.

Throughput planning starts with worker capacity. A single worker using batched provider Application Programming Interfaces (APIs) can reliably process roughly 100 notifications per second, including preference lookup, template rendering, and provider delivery with retries. To sustain 10,000 per second, you need approximately 100 workers, assuming even distribution. With priority lanes, however, you size each pool for its own peak: high priority might need 34 workers for 3,333 per second, medium another 34, and low priority 67 workers for 6,667 per second. Over-provisioning high-priority lanes by 50% provides headroom when bulk campaigns spike low-priority queues.

Batching dramatically improves efficiency but adds latency. If you batch 10 notifications before calling provider APIs, the worker count drops from 100 to roughly 10 for the same throughput, because network round trips dominate processing time. The trade-off is latency: a batch must wait until it fills or times out (typically 50 to 200 milliseconds). High-priority lanes often disable batching to achieve sub-100-millisecond processing, accepting higher worker costs, while low-priority lanes batch aggressively to minimize infrastructure spend.

Amazon Web Services (AWS) Simple Notification Service (SNS) and Simple Queue Service (SQS) demonstrate these limits concretely.
SNS Standard topics sustain roughly 30,000 messages per second for fan-out, while First In First Out (FIFO) topics drop to about 3,000 per second because of their ordering guarantees. SQS Standard queues scale horizontally with effectively no throughput limit, but FIFO queues enforce per-message-group ordering at reduced throughput. If you need 10,000 per second with some ordering (say, per user), partition users into 256 message groups in an SQS FIFO queue: each group carries roughly 40 per second, well within limits, while preserving per-user sequencing.
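The per-user partitioning above can be sketched with a stable hash from user ID to message group. This is a minimal illustration, not a prescribed implementation: `message_group_for` and the notification fields are assumed names, and the `boto3` call shown in the comment is only a rough shape of how the group ID would be attached.

```python
# Sketch: partition users into a fixed number of SQS FIFO message groups so
# per-user ordering is preserved while per-group throughput stays within
# limits (~40/s per group at 10,000/s across 256 groups).
import hashlib

NUM_GROUPS = 256  # from the text: 256 message groups

def message_group_for(user_id: str) -> str:
    """Stable hash of a user ID into one of NUM_GROUPS message groups.

    All notifications for the same user map to the same group, so SQS FIFO
    preserves their relative order; different users spread across groups.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return f"group-{int.from_bytes(digest[:4], 'big') % NUM_GROUPS}"

# With boto3, attaching the group ID looks roughly like:
#   sqs.send_message(QueueUrl=queue_url, MessageBody=payload,
#                    MessageGroupId=message_group_for(user_id),
#                    MessageDeduplicationId=notification_id)
```

A cryptographic hash is overkill for distribution but guarantees stability across processes and languages, which matters because the user-to-group mapping must never change while ordered messages are in flight.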
💡 Key Takeaways
Worker capacity planning: a single worker processes approximately 100 notifications per second with batched APIs. For a 10,000 per second peak, provision roughly 100 workers, or partition into priority lanes with independent sizing (34 high, 34 medium, 67 low).
Batching reduces worker count by 10x (from 100 to 10 workers for the same throughput) by amortizing network round trips, but adds 50 to 200 milliseconds of latency waiting for batch fill or timeout. Disable batching for high-priority lanes that need sub-100-millisecond processing.
Amazon Web Services (AWS) Simple Notification Service (SNS) Standard topics sustain roughly 30,000 messages per second, but First In First Out (FIFO) topics drop to 3,000 per second with ordering. Partition users into 256 message groups to reach 10,000 per second with per-user ordering at roughly 40 per second per group.
Over-provision high-priority worker pools by 50% to absorb spikes when low-priority campaigns saturate shared infrastructure such as caches or databases. Monitor queue depth and message age: alert if high-priority message age exceeds 5 to 10 seconds.
Failure mode: without lane isolation, a single bulk campaign sending 1 million notifications can backlog the entire system for minutes, causing one-time passwords to time out and fraud alerts to miss Service Level Objectives (SLOs). Separate topics prevent this cascading failure.
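The fill-or-timeout behavior behind the batching trade-off can be sketched as a small consumer-side helper. This is an illustrative synchronous sketch, not a production consumer: `next_batch` is a hypothetical name, and the batch size and timeout are the figures from the text.

```python
# Sketch: fill-or-timeout batching. A batch is sent when it reaches
# BATCH_SIZE or when BATCH_TIMEOUT elapses, whichever comes first; the
# timeout bounds the extra latency batching adds to each notification.
import time
from queue import Queue, Empty

BATCH_SIZE = 10       # notifications per provider API call
BATCH_TIMEOUT = 0.2   # 200 ms: worst-case added latency waiting for fill

def next_batch(q: Queue) -> list:
    """Collect up to BATCH_SIZE items, waiting at most BATCH_TIMEOUT total."""
    batch = []
    deadline = time.monotonic() + BATCH_TIMEOUT
    while len(batch) < BATCH_SIZE:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break  # timeout: ship a partial batch rather than wait longer
        try:
            batch.append(q.get(timeout=remaining))
        except Empty:
            break  # queue drained and timeout expired
    return batch
```

Under heavy load the queue is never empty, so batches fill instantly and the timeout is irrelevant; under light load every notification pays up to the full timeout, which is why high-priority lanes set `BATCH_SIZE = 1` instead.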
📌 Examples
Banking notification system with 1 million Daily Active Users (DAU): the high-priority lane handles 10,000 one-time passwords per day (an average of 0.1 per second, peaking at 10 per second during login storms) with 2 workers and no batching for sub-200-millisecond latency.
E-commerce platform during Black Friday: the low-priority marketing lane peaks at 50,000 per second (500 workers with batch size 10), while the high-priority order-confirmation lane maintains 1,000 per second (10 dedicated workers, batch size 1) on isolated capacity.
Ride-sharing app separating driver-arrival notifications (high priority, 5,000 per second peak, 50 workers, no batching) from promotional offers (low priority, 20,000 per second, 200 workers, batch size 10) to guarantee sub-1-second driver notifications even during campaign blasts.
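The worker counts in these examples all follow from the same back-of-envelope model: peak rate divided by per-worker throughput, rounded up, with optional headroom. A minimal sketch, assuming the ~100 notifications per second per worker figure from this section (`workers_needed` is a made-up helper name):

```python
# Sketch: back-of-envelope worker-pool sizing for a priority lane.
# PER_WORKER_RPS = 100 is the assumed single-worker throughput from the
# text (preference lookup + template rendering + batched provider call).
import math

PER_WORKER_RPS = 100

def workers_needed(peak_rps: float, headroom: float = 1.0) -> int:
    """Workers to sustain peak_rps, optionally over-provisioned by headroom."""
    return math.ceil(peak_rps * headroom / PER_WORKER_RPS)

# Checks against the numbers in this section:
#   10,000/s total        -> 100 workers
#   3,333/s high priority ->  34 workers (50 with the 50% headroom applied)
#   6,667/s low priority  ->  67 workers
```

The same arithmetic reproduces the example configurations: 50,000/s needs 500 workers, 5,000/s needs 50, and 20,000/s needs 200.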