Multi-Tenancy Patterns and Noisy-Neighbor Isolation
Multi-Tenancy Architecture Choices
Search systems serving multiple tenants (customers, teams, or business units that share infrastructure) must choose between isolation and efficiency. Index per tenant gives each tenant dedicated shards with independent lifecycle management and zero risk of one tenant affecting others. Shared index with routing packs many tenants into common indexes, routing documents and queries by tenant ID to specific shards. Each pattern trades off operational complexity, resource isolation, and cost at different tenant scales.
The critical decision factor is tenant size distribution. Large tenants with more than 10 GB of data or more than 100 queries per second justify dedicated indexes with custom tuning. Small tenants benefit from shared indexes that amortize overhead. Most production systems use hybrid approaches: dedicated indexes for large tenants, shared indexes with routing for the long tail of small tenants.
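The hybrid placement policy above can be sketched as a simple tiering function. This is an illustrative sketch, not any product's API; the thresholds come from the text, while the index names and function signature are assumptions.

```python
# Sketch of a hybrid placement policy: dedicated indexes for large tenants,
# a shared routed index for the long tail. Index names are illustrative.

DEDICATED_GB = 10     # data-size threshold from the text
DEDICATED_QPS = 100   # query-rate threshold from the text

def placement(tenant_id: str, data_gb: float, peak_qps: float) -> str:
    """Return the index in which a tenant's documents should live."""
    if data_gb > DEDICATED_GB or peak_qps > DEDICATED_QPS:
        return f"tenant-{tenant_id}"   # dedicated index, custom tuning
    return "tenants-shared"            # shared index, routed by tenant ID

print(placement("acme", data_gb=42.0, peak_qps=250))  # dedicated tier
print(placement("tiny", data_gb=0.3, peak_qps=2))     # shared tier
```

In practice the tiering decision is re-evaluated periodically, since tenants grow; moving a tenant between tiers requires a reindex.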
Index per Tenant Trade offs
Index per tenant provides strong isolation: each tenant has independent shards, schema control, and lifecycle policies. If one tenant needs different analyzers, retention periods, or replica counts, their index is customized without affecting others. Deleting a churned tenant means dropping their index, which is instant and clean. Performance issues from one tenant (expensive queries, unusual traffic patterns) affect only their shards.
However, index per tenant creates cluster state bloat with many tenants. Each index requires metadata tracked by the cluster master, and each shard consumes heap memory. A cluster with 5,000 tenants and 5 shards per tenant reaches 25,000 total shards, overwhelming coordination. Cluster state (the metadata tracking all indexes, shards, and mappings) can grow to gigabytes, causing slow cluster updates and elevated master node CPU. This pattern works for hundreds of medium-to-large tenants, not thousands of small ones.
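The shard arithmetic from the paragraph above, plus the effect of replication (the replica count is an assumption for illustration):

```python
# Back-of-envelope shard math for index-per-tenant (numbers from the text).
tenants = 5_000
shards_per_tenant = 5
primaries = tenants * shards_per_tenant
print(primaries)            # 25000 primary shards

# With one replica per primary (a common default), the copy count doubles,
# and every copy consumes heap and cluster-state metadata.
with_one_replica = primaries * 2
print(with_one_replica)     # 50000
```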
Shared Index with Routing
Shared indexes pack many small tenants into common shards, using a routing field (typically tenant ID) to direct documents and queries to specific shards. This dramatically reduces total shard count: 5,000 tenants share perhaps 50 shards instead of 25,000. Query routing ensures tenant A's queries scan only the shards containing tenant A's documents, maintaining logical isolation despite physical co-location.
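Routing generally works by hashing the routing value modulo the shard count, so indexing and querying with the same tenant ID always resolve to the same shard. A minimal sketch (MD5 stands in for whatever hash the engine actually uses; shard count from the text):

```python
import hashlib

NUM_SHARDS = 50  # shared-index shard count from the text

def shard_for(routing_value: str, num_shards: int = NUM_SHARDS) -> int:
    """shard = hash(routing_value) % num_shards, the usual routing scheme.
    MD5 is illustrative; real engines use their own hash function."""
    h = int(hashlib.md5(routing_value.encode()).hexdigest(), 16)
    return h % num_shards

# Indexing and querying use the same routing value, so a tenant's query
# touches one shard instead of fanning out to all 50.
print(shard_for("tenant-42") == shard_for("tenant-42"))  # True: stable routing
```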
Shared indexes require strict governance to prevent mapping explosions: when tenants can add arbitrary fields, the mapping (the schema defining field names and types) grows unbounded, inflating cluster state (the central metadata) and increasing heap usage on all nodes. Prevention requires enforced schemas, field name limits, and dynamic mapping disabled. Without governance, a single tenant adding thousands of unique field names degrades the entire cluster.
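The governance controls above map to concrete index configuration. The sketch below uses Elasticsearch-style setting names (`dynamic: strict`, `index.mapping.total_fields.limit`); verify the exact names and the field limit against your engine's documentation, and treat the field list as a placeholder schema.

```python
# Elasticsearch-style index body enforcing mapping governance on a
# shared index. Setting names follow Elasticsearch conventions; the
# limit value and the schema fields are illustrative.
shared_index_body = {
    "settings": {
        # Hard cap on total field count, so no tenant can inflate the
        # mapping (and cluster state) for everyone.
        "index.mapping.total_fields.limit": 200,
    },
    "mappings": {
        # Reject documents containing undeclared fields instead of
        # silently growing the mapping (dynamic mapping disabled).
        "dynamic": "strict",
        "properties": {
            "tenant_id": {"type": "keyword"},
            "title": {"type": "text"},
            "created_at": {"type": "date"},
        },
    },
}
```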
Noisy Neighbor Isolation
The noisy neighbor problem occurs when one tenant running expensive queries degrades performance for co-located tenants. A tenant executing wildcard queries (queries using patterns like *substring* that require full index scans) or unbounded aggregations can spike shard latency from 50ms to 5 seconds, affecting all tenants on those shards. p99 latency (the 99th percentile, meaning 99% of requests complete faster) for innocent tenants spikes because they share shards with the problematic tenant.
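A toy calculation shows how quickly a noisy neighbor contaminates shared-shard p99: if even 5% of requests on a shard are 5-second scans, the p99 observed by every co-located tenant becomes the slow queries' latency. The traffic mix below is invented for illustration.

```python
def p99(samples_ms):
    """Nearest-rank approximation of the 99th-percentile latency."""
    s = sorted(samples_ms)
    return s[int(0.99 * (len(s) - 1))]

healthy = [50] * 100                # all tenants well-behaved: p99 = 50 ms
mixed = [50] * 95 + [5000] * 5      # 5% wildcard scans from one tenant

print(p99(healthy))  # 50
print(p99(mixed))    # 5000: innocent tenants' p99 jumps to the slow queries
```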
Mitigation requires per-tenant resource controls: query timeouts (typically 100 to 300 ms) that terminate expensive queries before the damage compounds, concurrency limits (such as a maximum of 10 concurrent queries per tenant) that prevent queue flooding, and admission control that rejects new queries when per-tenant queue depth exceeds thresholds. Some systems implement tenant-aware scheduling that deprioritizes known-expensive tenants or moves them to dedicated nodes.
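The concurrency-limit and admission-control pieces can be sketched as a per-tenant in-flight counter in front of the search engine. This is a minimal illustration (class and method names are invented); query timeouts would be enforced separately by the engine itself.

```python
import threading
from collections import defaultdict

MAX_CONCURRENT = 10  # per-tenant concurrency cap from the text

class TenantAdmission:
    """Per-tenant admission control: reject a query up front when the
    tenant already has `limit` queries in flight, shedding load instead
    of letting one tenant flood the shared queues."""

    def __init__(self, limit: int = MAX_CONCURRENT):
        self.limit = limit
        self.in_flight = defaultdict(int)
        self.lock = threading.Lock()

    def try_admit(self, tenant: str) -> bool:
        with self.lock:
            if self.in_flight[tenant] >= self.limit:
                return False         # reject: tenant is at its cap
            self.in_flight[tenant] += 1
            return True

    def release(self, tenant: str) -> None:
        """Call when the tenant's query completes or times out."""
        with self.lock:
            self.in_flight[tenant] -= 1

ctl = TenantAdmission(limit=2)
print(ctl.try_admit("noisy"), ctl.try_admit("noisy"), ctl.try_admit("noisy"))
# True True False: the third query is shed at admission
print(ctl.try_admit("quiet"))  # True: other tenants are unaffected
```

Rejecting at admission (rather than queueing) keeps a noisy tenant's backlog from occupying shared queue slots that well-behaved tenants need.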