Multi-Tenant DLQs and Per-Consumer Isolation Patterns

The Anti-Pattern:

In pub-sub systems where multiple consumers read a shared topic, a single shared DLQ is dangerous. If Consumer A (order service) and Consumer B (analytics) both dead-letter to the same queue, you cannot safely redrive: the messages would be delivered to both consumers again, causing duplicate side effects in the service that had already processed them.

❗ Remember: Per-Consumer DLQs
Attach DLQs to individual subscriptions, not topics. Microsoft Azure Service Bus and Google Pub/Sub both implement dead-lettering per subscription: Service Bus gives each subscription its own dead-letter sub-queue, and Pub/Sub lets each subscription name its own dead-letter topic. This enables Consumer A to redrive without affecting Consumer B, and allows different retry policies per consumer.
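The isolation described above can be sketched with a small in-memory model. This is an illustration only, not any broker's API: `Topic`, `Subscription`, and both handlers are hypothetical names, and retries happen immediately with no delay.

```python
from collections import deque
from typing import Callable


class Subscription:
    """One consumer's view of a topic, with its own dead-letter queue (hypothetical model)."""

    def __init__(self, name: str, handler: Callable[[dict], None], max_attempts: int):
        self.name = name
        self.handler = handler
        self.max_attempts = max_attempts
        self.dlq: deque = deque()  # belongs to this subscription only

    def deliver(self, message: dict) -> None:
        for _ in range(self.max_attempts):
            try:
                self.handler(message)
                return  # processed successfully
            except Exception:
                continue  # sketch only: retry immediately on any failure
        self.dlq.append(message)  # attempts exhausted: dead-letter for this consumer


class Topic:
    """Fan-out topic: every subscription receives every published message."""

    def __init__(self) -> None:
        self.subscriptions: list = []

    def subscribe(self, sub: Subscription) -> None:
        self.subscriptions.append(sub)

    def publish(self, message: dict) -> None:
        for sub in self.subscriptions:
            sub.deliver(message)


analytics_seen = []


def order_handler(message: dict) -> None:
    # Assumed rule for the demo: the order service needs an "amount" field.
    if "amount" not in message:
        raise ValueError("missing amount")


def analytics_handler(message: dict) -> None:
    analytics_seen.append(message["id"])


orders = Subscription("order-service", order_handler, max_attempts=3)
analytics = Subscription("analytics", analytics_handler, max_attempts=2)

topic = Topic()
topic.subscribe(orders)
topic.subscribe(analytics)
topic.publish({"id": 1, "amount": 10})
topic.publish({"id": 2})  # fails only in the order service

print([m["id"] for m in orders.dlq])  # message 2 is dead-lettered for orders only
print(len(analytics.dlq), analytics_seen)
```

Because each `Subscription` owns its `dlq`, the failed order message never appears in the analytics consumer's failure path, and analytics has already processed it exactly once.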

Redrive Mechanics:

You cannot redrive from a subscription's DLQ back to the shared source topic, because every other subscription would receive those messages again. Instead, create a dedicated retry topic or queue that only the affected consumer reads from. Amazon architectures implement separate retry queues per consumer group, each rate-limited independently.
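A minimal sketch of that redrive path, assuming in-memory deques stand in for the real queues. The `redrive` function, its `max_per_second` parameter, and the queue names are hypothetical; a real broker would supply its own move or replay mechanism.

```python
import time
from collections import deque


def redrive(dlq: deque, retry_queue: deque, max_per_second: float, sleep=time.sleep) -> int:
    """Move messages from one consumer's DLQ to that same consumer's retry queue.

    The shared source topic is never touched, so other subscriptions
    do not see the redriven messages. Returns the number of messages moved.
    """
    interval = 1.0 / max_per_second  # simple fixed pacing, not a token bucket
    moved = 0
    while dlq:
        retry_queue.append(dlq.popleft())
        moved += 1
        if dlq:  # pause only between messages, not after the last one
            sleep(interval)
    return moved


orders_dlq = deque([{"id": 1}, {"id": 2}, {"id": 3}])
orders_retry_queue = deque()  # read only by the order service
analytics_inbox = deque()     # stands in for the analytics subscription

pauses = []  # record pauses instead of sleeping, to keep the demo fast
moved = redrive(orders_dlq, orders_retry_queue, max_per_second=100, sleep=pauses.append)

print(moved, len(orders_dlq), len(analytics_inbox))
```

Passing `sleep` as a parameter keeps the pacing testable; in production the default `time.sleep` (or the broker's own rate limit) would apply.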

Heterogeneous Failure Handling:

Per-consumer DLQs enable different policies: the order service might retry 10 times over 5 minutes before dead-lettering, while the analytics pipeline tolerates data loss and drops messages after 2 attempts to avoid a backlog.
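One way to express such per-consumer policies is as plain configuration. The sketch below is hypothetical: `RetryPolicy` and its fields are illustrative names, "retry 10 times" is read as 10 total attempts, retries are spaced evenly for simplicity, and the 5-second window for analytics is an assumed value that the text does not specify.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    retry_window_seconds: float
    dead_letter_on_exhaustion: bool  # False means drop the message instead

    def delay_between_attempts(self) -> float:
        """Spread attempts evenly across the window (fixed delay, no backoff)."""
        gaps = max(self.max_attempts - 1, 1)
        return self.retry_window_seconds / gaps

    def next_action(self, attempts_so_far: int) -> str:
        """Decide what to do after a failed attempt."""
        if attempts_so_far < self.max_attempts:
            return "retry"
        return "dead_letter" if self.dead_letter_on_exhaustion else "drop"


POLICIES = {
    # 10 attempts over 5 minutes, then dead-letter (figures from the text above)
    "order-service": RetryPolicy(max_attempts=10, retry_window_seconds=300,
                                 dead_letter_on_exhaustion=True),
    # 2 attempts, then drop; the 5-second window is an assumption
    "analytics": RetryPolicy(max_attempts=2, retry_window_seconds=5,
                             dead_letter_on_exhaustion=False),
}

for name, policy in POLICIES.items():
    print(name, policy.next_action(policy.max_attempts))
```

Keeping the policy as data per consumer makes the difference in criticality explicit and lets each team tune its own values without touching the shared topic.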

Schema Evolution Benefits:

When you deploy a new message version, only consumers that haven't upgraded fail validation and populate their DLQs, while upgraded consumers process successfully. This allows gradual rollout without coordinated downtime.
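The rollout behaviour can be illustrated with a toy model. Everything here is assumed for the example: the `schema_version` field, the `make_consumer` helper, and the rule that a consumer dead-letters any version it does not support.

```python
def make_consumer(supported_versions: set):
    """Build a consumer that dead-letters messages whose version it does not support."""
    processed, dlq = [], []

    def consume(message: dict) -> None:
        if message["schema_version"] in supported_versions:
            processed.append(message["id"])
        else:
            dlq.append(message["id"])  # goes to this consumer's own DLQ

    return consume, processed, dlq


legacy_consume, legacy_processed, legacy_dlq = make_consumer({1})
upgraded_consume, upgraded_processed, upgraded_dlq = make_consumer({1, 2})

# During rollout, producers emit a mix of old (v1) and new (v2) messages.
messages = [
    {"id": 1, "schema_version": 1},
    {"id": 2, "schema_version": 2},
    {"id": 3, "schema_version": 2},
    {"id": 4, "schema_version": 1},
]

for message in messages:
    legacy_consume(message)
    upgraded_consume(message)

print(legacy_processed, legacy_dlq)
print(upgraded_processed, upgraded_dlq)
```

Once the legacy consumer is upgraded, the messages held in its DLQ can be redriven to it alone, which is what makes the rollout gradual rather than coordinated.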

💡 Key Takeaways

✓ Per-consumer DLQs attach to subscriptions, not topics, enabling isolated redrive without affecting other consumers in multi-tenant pub-sub architectures

✓ Microsoft Azure Service Bus and Google Pub/Sub implement per-subscription dead-letter queues, allowing heterogeneous retry policies and retention per consumer

✓ Redrive from a subscription DLQ must go to a consumer-specific retry queue, never back to the shared source topic, which would deliver to all subscriptions

✓ Heterogeneous failure handling allows the order service to retry 10 times over 5 minutes while the analytics pipeline drops messages after 2 attempts to avoid a backlog

✓ Schema evolution becomes gradual: only consumers that have not yet upgraded to the new message version populate their DLQs, while upgraded consumers process successfully during rollout

✓ Amazon architectures use a separate retry queue per consumer group, with independent rate limiting, feeding back into that consumer's pipeline

📌 Interview Tips

1. E-commerce system with order service and analytics reading shared topic: order service DLQ uses 10 retry attempts and 24 hour retention, analytics uses 2 attempts and 1 hour retention based on different criticality

2. During schema migration adding required customer_id field, legacy analytics consumers dead letter 30 percent of messages while upgraded order service processes 100 percent, enabling phased rollout over 3 days

3. Google Pub/Sub customer creates dedicated retry subscription for order service reading from its DLQ topic at 100 messages per second, isolated from analytics subscription processing at full speed
