Pull vs Push: Discovery Update Distribution Strategies
Pull-Based Discovery
Clients periodically poll the registry for updates. This is simple to implement: each client fetches the full service list every 5-30 seconds and caches it locally. The trade-off is staleness: a new instance is not discoverable until the next poll. With 1000 clients polling every 10 seconds, the registry handles 100 requests/second. Registry load grows linearly with the number of clients.
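The load arithmetic and the caching behavior can be sketched in a few lines of Python. This is a minimal illustration, not any particular registry's client: registry_request_rate, PollingClient, and the fetch_all callback are hypothetical names, and the cache is refreshed lazily on lookup rather than by a background timer to keep the example short.

```python
import time
from typing import Callable, Dict, List, Optional

Snapshot = Dict[str, List[str]]


def registry_request_rate(num_clients: int, poll_interval_s: float) -> float:
    """Average requests per second the registry receives from polling clients."""
    return num_clients / poll_interval_s


class PollingClient:
    """Caches the full service list and refreshes it at most once per interval."""

    def __init__(
        self,
        fetch_all: Callable[[], Snapshot],  # hypothetical call to the registry
        poll_interval_s: float = 10.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_all = fetch_all
        self._interval = poll_interval_s
        self._clock = clock
        self._cache: Snapshot = {}
        self._last_poll: Optional[float] = None

    def lookup(self, service: str) -> List[str]:
        now = self._clock()
        # Refresh only when the cache is empty or the poll interval has elapsed.
        if self._last_poll is None or now - self._last_poll >= self._interval:
            self._cache = self._fetch_all()
            self._last_poll = now
        # Anything registered since the last poll is invisible until the next one.
        return self._cache.get(service, [])
```

Calling registry_request_rate(1000, 10) reproduces the 100 requests/second figure above. Doubling the client count or halving the interval doubles the load.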
Push-Based Discovery
The registry notifies clients when the service topology changes. Clients maintain a connection (WebSocket, gRPC stream, or long polling) to receive updates. Updates propagate in milliseconds instead of waiting for the next poll interval. Registry load is driven by change frequency rather than by a polling rate, although each change must still be fanned out to every connected client. The registry must also track all client connections, which consumes memory and connection resources.
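The shape of a push registry can be sketched in-process, with plain callbacks standing in for network connections. PushRegistry and its methods are hypothetical names chosen for illustration. A real registry would hold a WebSocket or gRPC stream per client where this sketch holds a function.

```python
from typing import Callable, Dict, List

Listener = Callable[[str, List[str]], None]


class PushRegistry:
    """Keeps one entry per connected client and fans every change out to all."""

    def __init__(self) -> None:
        self._services: Dict[str, List[str]] = {}
        self._listeners: Dict[int, Listener] = {}  # per-connection state
        self._next_id = 0

    def connect(self, listener: Listener) -> int:
        conn_id = self._next_id
        self._next_id += 1
        self._listeners[conn_id] = listener
        return conn_id

    def disconnect(self, conn_id: int) -> None:
        self._listeners.pop(conn_id, None)

    def connection_count(self) -> int:
        return len(self._listeners)

    def update(self, service: str, instances: List[str]) -> int:
        """Record a topology change and notify every connected client.

        Returns the number of notifications sent for this change.
        """
        self._services[service] = list(instances)
        # Copy the listener list so a callback may disconnect while we iterate.
        for listener in list(self._listeners.values()):
            listener(service, list(instances))
        return len(self._listeners)
```

No work happens while the topology is quiet, but the listener table grows with the number of connected clients. That table is the memory and connection cost described above.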
Watch Mechanisms
Many registries support watches: clients register interest in specific services and receive notifications only for those services. This reduces bandwidth compared to receiving every update. The registry maintains watch state per client per service. When service A changes, only clients watching A are notified; clients interested only in service B receive nothing.
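A minimal sketch of per-client, per-service watch state follows. WatchRegistry, watch, unwatch, and update are hypothetical names, and callbacks again stand in for network notifications.

```python
from typing import Callable, Dict, List

Callback = Callable[[str, List[str]], None]


class WatchRegistry:
    """Stores watch state as service -> {client_id -> callback}."""

    def __init__(self) -> None:
        self._watchers: Dict[str, Dict[str, Callback]] = {}

    def watch(self, client_id: str, service: str, callback: Callback) -> None:
        self._watchers.setdefault(service, {})[client_id] = callback

    def unwatch(self, client_id: str, service: str) -> None:
        watchers = self._watchers.get(service)
        if watchers is None:
            return
        watchers.pop(client_id, None)
        if not watchers:
            # Drop empty entries so watch state does not accumulate.
            del self._watchers[service]

    def update(self, service: str, instances: List[str]) -> List[str]:
        """Notify only the clients watching this service.

        Returns the sorted ids of the clients that were notified.
        """
        watchers = dict(self._watchers.get(service, {}))
        for callback in watchers.values():
            callback(service, list(instances))
        return sorted(watchers)
```

A change to service A touches only the entries stored under A, so clients that watch B alone are never contacted.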
Hybrid Approaches
Production systems often combine both: push for immediate notification of changes, with a periodic pull as a backup to catch missed updates. The pull interval can be long (5 minutes) because push handles normal updates. This provides fast propagation in the common case, and the cache still converges eventually if a push connection fails.
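One way to picture the combination is a client whose cache is written by two paths: a push handler for individual changes and a slow full resync that overwrites whatever the cache holds. The sketch below assumes a hypothetical fetch_all callback that returns the registry's full snapshot. HybridClient and its method names are illustrative only.

```python
import time
from typing import Callable, Dict, List, Optional

Snapshot = Dict[str, List[str]]


class HybridClient:
    """Applies pushed updates immediately and reconciles with a periodic pull."""

    def __init__(
        self,
        fetch_all: Callable[[], Snapshot],  # hypothetical full-list fetch
        resync_interval_s: float = 300.0,   # 5 minutes, as in the text
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch_all = fetch_all
        self._interval = resync_interval_s
        self._clock = clock
        self._cache: Snapshot = {}
        self._last_sync: Optional[float] = None

    def on_push(self, service: str, instances: List[str]) -> None:
        """Fast path: apply a single pushed change."""
        self._cache[service] = list(instances)

    def maybe_resync(self) -> bool:
        """Slow path: replace the cache with a full snapshot when one is due.

        Returns True if a resync was performed.
        """
        now = self._clock()
        if self._last_sync is not None and now - self._last_sync < self._interval:
            return False
        snapshot = self._fetch_all()
        self._cache = {name: list(inst) for name, inst in snapshot.items()}
        self._last_sync = now
        return True

    def lookup(self, service: str) -> List[str]:
        return list(self._cache.get(service, []))
```

If a push is lost, the cache stays wrong for at most one resync interval. That bound is what makes the long pull interval acceptable.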
Propagation Delays
Even with push, propagation is not instant. When a service instance fails, the health check must detect the failure (10s), the registry must update its state, push notifications must fan out (100ms), and clients must process the update. Total propagation can take 10-30 seconds. During this window, clients may route to failed instances. Client-side retries and circuit breakers cover this gap.
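The stages can be added up explicitly. In the sketch below, only the 10-second detection figure and the 100ms fan-out come from the text. The registry update time, the client processing time, and the idea of a failure threshold (consecutive failed checks before an instance is marked down) are assumptions added for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropagationStages:
    check_interval_s: float = 10.0   # from the text: detection takes about 10s
    failure_threshold: int = 1       # assumption: failed checks needed to mark down
    registry_update_s: float = 0.05  # assumption
    fanout_s: float = 0.1            # from the text: about 100ms
    client_apply_s: float = 0.05     # assumption

    def worst_case_detection_s(self) -> float:
        # Worst case: the instance fails right after a passing check, so every
        # required failed check costs one full interval.
        return self.check_interval_s * self.failure_threshold

    def worst_case_total_s(self) -> float:
        return (
            self.worst_case_detection_s()
            + self.registry_update_s
            + self.fanout_s
            + self.client_apply_s
        )
```

Under these assumptions, detection dominates the total and push fan-out contributes only a fraction of a second. Raising failure_threshold to 3 puts the worst case at roughly 30 seconds, which is one plausible way to reach the upper end of the range quoted above.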