Multi Tenancy Patterns and Noisy Neighbor Isolation

Search systems serving multiple tenants (customers, teams, or business units) must choose between an index per tenant for isolation and a shared index with routing for efficiency. Each pattern trades off operational complexity, resource isolation, and cost differently as the number and size of tenants change.

Index per tenant provides the strongest isolation: each tenant gets dedicated shards, independent lifecycle management, and no risk of one tenant's schema changes or query patterns affecting others. This works well for large tenants (such as those generating more than 10 GB of data or more than 100 QPS), but creates cluster state bloat and over sharding with thousands of small tenants. A cluster with 5,000 tenants and 5 shards per tenant reaches 25,000 total shards, overwhelming coordination and metadata management.

Shared index with routing packs many small tenants into a single index, routing documents and queries by tenant ID to specific shards. This dramatically reduces shard count and enables efficient multi tenant serving, but requires strict field governance to prevent mapping explosions (where each tenant introducing new fields inflates cluster state) and query budgets to prevent noisy neighbor effects: one tenant running expensive wildcard queries can degrade latency for all co located tenants.

Uber demonstrates hybrid multi tenancy for marketplace search: large cities (New York, Los Angeles, London) get dedicated indices with custom tuning, while smaller markets share indices with routing by geographic region. Each tenant has enforced query timeouts (typically 100 to 300ms), concurrency limits (such as max 10 concurrent queries per tenant), and admission control that rejects queries when per tenant queue depth exceeds thresholds. LinkedIn uses verticalized indices (people, jobs, content) where each vertical is effectively a large tenant with dedicated infrastructure and Service Level Objectives (SLOs).
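The size thresholds and shard arithmetic above can be sketched in a few lines of Python. This is a minimal illustration, not a sizing tool: the `Tenant` class and the `placement` and `total_shards` functions are hypothetical names, the 10 GB and 100 QPS cutoffs and the 5 shards per tenant come from the text, and the 50 shard shared index is an assumed size.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; treat them as starting points, not hard rules.
LARGE_TENANT_GB = 10
LARGE_TENANT_QPS = 100


@dataclass(frozen=True)
class Tenant:
    tenant_id: str
    data_gb: float
    qps: float


def placement(tenant: Tenant) -> str:
    """Return 'dedicated' for large tenants and 'shared' for everyone else."""
    if tenant.data_gb > LARGE_TENANT_GB or tenant.qps > LARGE_TENANT_QPS:
        return "dedicated"
    return "shared"


def total_shards(tenants, shards_per_dedicated=5, shared_index_shards=50):
    """Count primary shards for a hybrid layout (replicas ignored)."""
    dedicated = sum(1 for t in tenants if placement(t) == "dedicated")
    has_shared = dedicated < len(tenants)
    return dedicated * shards_per_dedicated + (shared_index_shards if has_shared else 0)


if __name__ == "__main__":
    small = [Tenant(f"t{i}", data_gb=0.5, qps=2) for i in range(5000)]
    big = Tenant("big", data_gb=200, qps=500)
    print(len(small) * 5)               # index per tenant for all 5,000: 25000
    print(total_shards(small))          # all small tenants in one shared index: 50
    print(total_shards(small + [big]))  # hybrid: 50 shared + 5 dedicated = 55
```

In practice the cutoffs would likely be tuned per cluster, and a tenant near a threshold might be migrated between layouts as it grows.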
💡 Key Takeaways
Index per tenant gives strong isolation and independent lifecycle for large tenants (more than 10 GB or more than 100 QPS) but creates over sharding with thousands of small tenants
Shared index with routing reduces shard count by packing small tenants together, but requires strict field governance and query budgets to prevent noisy neighbors
Mapping explosions occur when tenants dynamically add fields in a shared index, inflating cluster state and increasing heap usage on all nodes
Noisy neighbor problem: one tenant running expensive queries degrades latency for all co located tenants without per tenant query limits and admission control
Uber hybrid approach: large cities get dedicated indices with custom tuning, small markets share indices with routing by region plus per tenant query timeouts and concurrency caps
Per tenant query budgets include timeouts (100 to 300ms), concurrency limits (max 10 concurrent), and admission control rejecting queries when queue depth exceeds threshold
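The per tenant budget described above can be sketched as a small admission controller. This is an illustrative, single threaded sketch under stated assumptions: `QueryBudget` and `AdmissionController` are hypothetical names, the concurrency cap of 10 and the 300ms timeout come from the text, and the queue depth threshold of 20 is invented for the example because the text gives no number.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryBudget:
    timeout_ms: int = 300      # upper end of the 100 to 300ms range in the text
    max_concurrent: int = 10   # concurrency cap from the text
    max_queue_depth: int = 20  # hypothetical threshold; the text gives no number


class AdmissionController:
    """Tracks running and queued queries per tenant (not thread safe)."""

    def __init__(self, budget: QueryBudget = QueryBudget()):
        self.budget = budget
        self.running = defaultdict(int)
        self.queued = defaultdict(int)

    def admit(self, tenant_id: str) -> str:
        """Return 'run', 'queue', or 'reject' for a new query."""
        if self.running[tenant_id] < self.budget.max_concurrent:
            self.running[tenant_id] += 1
            return "run"
        if self.queued[tenant_id] < self.budget.max_queue_depth:
            self.queued[tenant_id] += 1
            return "queue"
        return "reject"  # queue depth exceeded: shed load for this tenant only

    def finish(self, tenant_id: str) -> None:
        """Record a completed query; a queued query, if any, takes the freed slot."""
        if self.queued[tenant_id] > 0:
            self.queued[tenant_id] -= 1
        elif self.running[tenant_id] > 0:
            self.running[tenant_id] -= 1


if __name__ == "__main__":
    ctrl = AdmissionController()
    decisions = [ctrl.admit("tenant-a") for _ in range(31)]
    print(decisions.count("run"), decisions.count("queue"), decisions.count("reject"))
    print(ctrl.admit("tenant-b"))  # tenant-b is unaffected by tenant-a's backlog
```

Because counters are kept per tenant, a burst from one tenant fills only its own slots and queue. A production version would need locking or an async design, and would pass `timeout_ms` to each query it runs.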
📌 Examples
Uber marketplace: New York gets a dedicated index with 20 shards and custom analyzers, while 50 smaller US cities share an index with 10 shards, routed by city ID
Over sharding scenario: 5,000 small tenants with 5 shards each create 25,000 shards and overwhelm cluster coordination, whereas a shared index with routing uses 50 shards total
Noisy neighbor: Tenant A runs an unbounded wildcard query taking 5 seconds, and all tenants on the same shard see p99 spike from 50ms to 5 seconds until a timeout is enforced
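For the shared index case, a routed and time bounded query might be assembled as follows. This sketch assumes an Elasticsearch style search request, where `routing` is a request parameter and `timeout` is a field in the search body. The function name `build_tenant_search`, the index name `tenants-shared`, and the `tenant_id` field are hypothetical. The function only builds a description of the request and does not call any client library.

```python
def build_tenant_search(tenant_id: str, user_query: dict, timeout_ms: int = 300) -> dict:
    """Describe a search against a shared index, scoped to one tenant."""
    return {
        "index": "tenants-shared",         # hypothetical shared index name
        "params": {"routing": tenant_id},  # target only the shard this tenant maps to
        "body": {
            "timeout": f"{timeout_ms}ms",  # per request search timeout
            "query": {
                "bool": {
                    "must": [user_query],
                    # Routing alone does not isolate data: other tenants can hash
                    # to the same shard, so the tenant filter is still required.
                    "filter": [{"term": {"tenant_id": tenant_id}}],
                }
            },
        },
    }


if __name__ == "__main__":
    request = build_tenant_search("city-42", {"match": {"name": "pizza"}})
    print(request["params"], request["body"]["timeout"])
```

Documents would need to be indexed with the same routing value, otherwise a routed query can miss them.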