
Pull vs Push: Discovery Update Distribution Strategies

Service registries must distribute endpoint updates to clients or load balancers, and there are two broad strategies for doing so.

Pull-based systems have clients periodically poll for changes. DNS is the classic example: you query a service name, get IP addresses back, and cache them according to a Time To Live (TTL) value. Google commonly uses 5 to 30 second TTLs for internal DNS, and Kubernetes CoreDNS serves endpoint lists with similar TTLs. The client controls refresh timing.

Push-based systems stream updates to clients. When an instance registers or fails a health check, the registry immediately pushes the change to all watchers. Google's xDS-style configuration streams and the Kubernetes watch API exemplify this. Changes propagate in under 1 second in steady state, and clients maintain persistent connections to receive real-time updates.

Pull is simple and universally supported: DNS works everywhere without special client libraries, and caching is straightforward with TTL semantics. However, reaction time is bounded by the polling interval. With a 30 second TTL, clients might send traffic to a dead instance for up to 30 seconds after failure. Reducing the TTL to 5 seconds increases registry load by 6x. And if you have 1,000 clients polling 1,000 services every 10 seconds, that's 100,000 queries per second, a classic N times M polling storm.

Push enables near real-time updates but requires persistent connections. If you have 10,000 clients watching 500 services, that's 5 million watch connections to maintain. Connection management, backpressure handling, and graceful reconnection become complex. During a registry restart, all clients reconnect simultaneously, creating a thundering herd. Push systems must implement stream heartbeats, sequence numbers for ordering, and careful rollout procedures to avoid overwhelming the registry with connection storms.
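To make the TTL tradeoff concrete, here is a minimal sketch of a pull-based client cache in Go. The Resolver type and its lookup callback are illustrative assumptions, not a real registry API; the point is that staleness is bounded entirely by the TTL the client chooses.

```go
// Sketch of pull-based discovery: a client-side cache that re-resolves
// a service name once its TTL expires. Resolver and lookup are
// hypothetical names, not a real library API.
package discovery

import (
	"sync"
	"time"
)

type entry struct {
	endpoints []string
	expires   time.Time
}

// Resolver caches endpoint lists per service name with a fixed TTL.
type Resolver struct {
	ttl    time.Duration
	lookup func(service string) ([]string, error) // e.g. a DNS or registry query
	mu     sync.Mutex
	cache  map[string]entry
}

func NewResolver(ttl time.Duration, lookup func(string) ([]string, error)) *Resolver {
	return &Resolver{ttl: ttl, lookup: lookup, cache: make(map[string]entry)}
}

// Resolve returns cached endpoints while they are fresh, otherwise polls
// the registry. Staleness is bounded by the TTL: a dead instance can keep
// receiving traffic for up to one full TTL after it fails.
func (r *Resolver) Resolve(service string) ([]string, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if e, ok := r.cache[service]; ok && time.Now().Before(e.expires) {
		return e.endpoints, nil
	}
	eps, err := r.lookup(service)
	if err != nil {
		return nil, err
	}
	r.cache[service] = entry{endpoints: eps, expires: time.Now().Add(r.ttl)}
	return eps, nil
}
```

Shrinking the ttl argument here is exactly the 6x load multiplier described above: every client re-runs lookup six times as often, and the registry absorbs the difference.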
💡 Key Takeaways
Pull-based discovery with a DNS TTL of 30 seconds means clients can route to failed instances for up to 30 seconds, causing user-visible errors until the cache expires
Push-based systems propagate changes in under 1 second but require managing persistent connections; 10,000 clients watching 500 services creates 5 million connections
Reducing DNS TTL from 30 to 5 seconds increases registry query load by 6x, risking overload; must balance freshness against infrastructure capacity
Push systems risk thundering herds during registry restarts when all clients reconnect simultaneously; implement jittered reconnection backoff and connection rate limits (see the sketch after this list)
Polling storms occur with N clients times M services; 1,000 clients polling 1,000 services every 10 seconds generates 100,000 queries per second without fanout optimization
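The jittered reconnection backoff mentioned above can be sketched in a few lines of Go. Here openWatch is a hypothetical stand-in for whatever call opens the watch stream; the key idea is full jitter, which spreads simultaneous reconnects across a window instead of letting every client retry in lockstep.

```go
// Sketch of jittered exponential backoff for watch reconnection,
// assuming a hypothetical openWatch function. The jitter spreads a
// thundering herd of reconnects across time after a registry restart.
package discovery

import (
	"math/rand"
	"time"
)

// reconnectWithBackoff retries openWatch with exponential backoff plus
// full jitter, capping the delay so clients never wait unboundedly.
func reconnectWithBackoff(openWatch func() error) {
	base, max := time.Second, 30*time.Second
	delay := base
	for {
		if err := openWatch(); err == nil {
			delay = base // stream ended cleanly; reset backoff and reconnect
			continue
		}
		// Full jitter: sleep a random duration in [0, delay) so that
		// thousands of clients do not hammer the registry in lockstep.
		time.Sleep(time.Duration(rand.Int63n(int64(delay))))
		if delay *= 2; delay > max {
			delay = max
		}
	}
}
```

Server-side connection rate limits complement this: even with jitter, the registry should admit reconnects at a bounded rate rather than accepting the entire fleet at once.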
📌 Examples
Google uses DNS with 5 to 30 second TTLs for broad compatibility plus streaming xDS APIs for sub-second config updates on critical routing paths
Kubernetes watch API streams endpoint changes with sequence numbers ensuring ordered updates; clients reconnect with their last-seen version to resume without missing changes (see the sketch after these examples)
Netflix Eureka clients poll every 30 seconds with client-side caching, accepting eventual consistency to maintain high availability during registry partitions
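The resume-from-last-seen-version pattern in the Kubernetes example can be sketched as follows. Event, watchFrom, and apply are hypothetical stand-ins rather than the real client-go API; the essential mechanics are tracking the highest sequence number applied and passing it back on reconnect.

```go
// Sketch of resumable watch semantics using sequence numbers, loosely
// modeled on the Kubernetes watch API. Event, watchFrom, and apply are
// illustrative stand-ins, not a real client library.
package discovery

// Event carries one endpoint change plus the registry's sequence number.
type Event struct {
	Seq       uint64
	Service   string
	Endpoints []string
}

// consume applies ordered events and remembers the last sequence seen,
// so a reconnect can resume without replaying or missing changes.
func consume(watchFrom func(seq uint64) (<-chan Event, error), apply func(Event)) error {
	var lastSeq uint64
	for {
		ch, err := watchFrom(lastSeq) // resume from the last seen version
		if err != nil {
			return err
		}
		for ev := range ch {
			if ev.Seq <= lastSeq {
				continue // duplicate delivered during reconnect; skip it
			}
			apply(ev)
			lastSeq = ev.Seq
		}
		// Channel closed: stream dropped; loop to reconnect from lastSeq.
	}
}
```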