Multi-Tenant DLQs and Per-Consumer Isolation Patterns
The Anti-Pattern:
In pub-sub systems where multiple consumers read a shared topic, a single shared DLQ is dangerous. If Consumer A (order service) and Consumer B (analytics) both dead-letter to the same queue, you cannot redrive safely: republished messages are delivered to both consumers again, so the service that had already processed them repeats its side effects.
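The failure mode can be reproduced with a toy in-memory model. This is only a sketch: the `Topic` class and the subscription names `orders` and `analytics` are hypothetical stand-ins, not any broker's API.

```python
class Topic:
    """Toy pub-sub topic: every publish fans out to all subscriptions."""

    def __init__(self):
        self.subscriptions = {}  # subscription name -> list of delivered messages

    def subscribe(self, name):
        self.subscriptions[name] = []

    def publish(self, message):
        for inbox in self.subscriptions.values():
            inbox.append(message)


topic = Topic()
topic.subscribe("orders")
topic.subscribe("analytics")

# Original delivery: both subscriptions receive message 1.
topic.publish({"id": 1})

# Assume only the order service failed on it and wrote it to the shared DLQ.
shared_dlq = [{"id": 1}]

# Naive redrive: republish DLQ contents to the shared source topic.
for message in shared_dlq:
    topic.publish(message)

orders_deliveries = len(topic.subscriptions["orders"])
analytics_deliveries = len(topic.subscriptions["analytics"])
print(orders_deliveries, analytics_deliveries)
```

Analytics processed the message successfully the first time, yet the redrive hands it a second copy. That second copy is the duplicate side effect described above.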
Redrive Mechanics:
You cannot redrive from a subscription's DLQ back to the shared source topic, because every other subscription would receive those messages too. Instead, create a dedicated retry topic or queue that only the affected consumer reads from. Architectures on AWS commonly use a separate retry queue per consumer group, each rate-limited independently.
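A minimal sketch of an isolated, rate-limited redrive, using in-memory deques in place of real queues. The function `redrive`, its `max_per_second` parameter, and the queue names are illustrative assumptions. A production redrive would use the broker's own tooling or client library.

```python
import time
from collections import deque


def redrive(dlq, retry_queue, max_per_second, sleep=time.sleep):
    """Move messages from one consumer's DLQ to that consumer's retry queue,
    pausing between messages so at most max_per_second are moved per second."""
    interval = 1.0 / max_per_second
    moved = 0
    while dlq:
        retry_queue.append(dlq.popleft())
        moved += 1
        if dlq:  # no need to wait after the final message
            sleep(interval)
    return moved


# Hypothetical queues: only the order service's DLQ and retry queue take part.
orders_dlq = deque([{"id": 1}, {"id": 2}, {"id": 3}])
orders_retry = deque()
analytics_inbox = deque()  # belongs to another consumer, never touched

moved = redrive(orders_dlq, orders_retry, max_per_second=100)
print(moved, len(orders_retry), len(analytics_inbox))
```

Because the retry queue has exactly one reader, the redrive rate can be tuned for the order service alone without slowing or duplicating work for any other subscription.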
Heterogeneous Failure Handling:
Per-consumer DLQs enable different policies: the order service might retry 10 times over 5 minutes before dead-lettering, while the analytics pipeline tolerates data loss and drops messages after 2 attempts to avoid a backlog.
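One way to express such per-consumer policies is a small lookup table consulted by the retry loop. This is a sketch only: `RetryPolicy`, `POLICIES`, and `process_with_policy` are hypothetical names, and the backoff timing (the "over 5 minutes" part) is deliberately left out to keep the example short.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    on_exhausted: str  # "dead_letter" or "drop"


# Attempt counts taken from the example above; everything else is assumed.
POLICIES = {
    "orders": RetryPolicy(max_attempts=10, on_exhausted="dead_letter"),
    "analytics": RetryPolicy(max_attempts=2, on_exhausted="drop"),
}


def process_with_policy(consumer, message, handler, dlqs):
    """Try the handler up to the consumer's max_attempts, then dead-letter
    or drop the message according to that consumer's policy."""
    policy = POLICIES[consumer]
    for attempt in range(1, policy.max_attempts + 1):
        try:
            handler(message)
            return ("ok", attempt)
        except Exception:
            continue  # a real consumer would back off before the next attempt
    if policy.on_exhausted == "dead_letter":
        dlqs[consumer].append(message)
        return ("dead_lettered", policy.max_attempts)
    return ("dropped", policy.max_attempts)


def always_fail(message):
    raise ValueError("simulated processing failure")


dlqs = {"orders": [], "analytics": []}
orders_result = process_with_policy("orders", {"id": 1}, always_fail, dlqs)
analytics_result = process_with_policy("analytics", {"id": 1}, always_fail, dlqs)
print(orders_result, analytics_result, dlqs)
```

The same failing message ends up in the order service's DLQ after ten attempts but is discarded by analytics after two, which is only possible because each consumer owns its own policy and DLQ.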
Schema Evolution Benefits:
When you deploy a new message version, only consumers that haven't upgraded fail validation and populate their DLQs, while upgraded consumers process successfully. This allows gradual rollout without coordinated downtime.
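The rollout behavior can be sketched with two consumers that validate against different sets of schema versions. The `schema_version` field, the `make_validator` helper, and the sample messages are assumptions made for illustration. Real systems often rely on a schema registry instead.

```python
def make_validator(supported_versions):
    """Return a validator that rejects messages with an unsupported version."""
    def validate(message):
        if message["schema_version"] not in supported_versions:
            raise ValueError(f"unsupported schema version {message['schema_version']}")
    return validate


consumers = {
    "orders": make_validator({1, 2}),   # already upgraded to understand v2
    "analytics": make_validator({1}),   # legacy, only understands v1
}
dlqs = {name: [] for name in consumers}
processed = {name: [] for name in consumers}


def deliver(message):
    """Fan the message out; each consumer dead-letters into its own DLQ."""
    for name, validate in consumers.items():
        try:
            validate(message)
            processed[name].append(message)
        except ValueError:
            dlqs[name].append(message)


deliver({"schema_version": 1, "order_id": "a"})
deliver({"schema_version": 2, "order_id": "b", "customer_id": "c-9"})
print({name: len(q) for name, q in dlqs.items()})
```

Only the legacy consumer's DLQ fills up, so its backlog can be redriven to that consumer alone once it has been upgraded.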
❗ Remember: Per-Consumer DLQs
Attach DLQs to individual subscriptions, not topics. Azure Service Bus gives each subscription its own dead-letter queue, and Google Pub/Sub lets you configure a dead-letter topic on each subscription. This enables Consumer A to redrive without affecting Consumer B and allows a different retry policy for each consumer.
💡 Key Takeaways
✓ Per-consumer DLQs attach to subscriptions, not topics, enabling isolated redrive without affecting other consumers in multi-tenant pub-sub architectures
✓ Azure Service Bus and Google Pub/Sub implement per-subscription dead-letter queues, allowing heterogeneous retry policies and retention per consumer
✓ Redrive from a subscription DLQ must go to a consumer-specific retry queue, never back to the shared source topic, which would deliver to all subscriptions
✓ Heterogeneous failure handling lets the order service retry 10 times over 5 minutes while the analytics pipeline drops messages after 2 attempts to avoid a backlog
✓ Schema evolution becomes gradual: only consumers not yet upgraded for the new schema populate their DLQs, while upgraded consumers process successfully during rollout
✓ Architectures on AWS use a separate retry queue per consumer group, with independent rate limiting, feeding back into that consumer's pipeline
📌 Interview Tips
1. E-commerce system with an order service and analytics reading a shared topic: the order service DLQ uses 10 retry attempts and 24-hour retention, while analytics uses 2 attempts and 1-hour retention, reflecting their different criticality
2. During a schema migration that adds a required customer_id field, legacy analytics consumers dead-letter 30 percent of messages while the upgraded order service processes 100 percent, enabling a phased rollout over 3 days
3. A Google Pub/Sub customer creates a dedicated retry subscription for the order service that reads from its DLQ topic at 100 messages per second, isolated from the analytics subscription, which keeps processing at full speed