Networking & Protocols • CDN Architecture & Edge Computing · Hard · ⏱️ ~3 min
How Do Pull vs Push CDN Models and Configuration Propagation Trade-Offs Impact Production Systems?
Pull CDNs fetch content from the origin on demand when a cache miss occurs, while push CDNs require operators to proactively upload content to edge PoPs. Pull models dominate modern CDN architectures because they simplify operations for frequently changing content. The first user request experiences a cache miss with higher latency (a full round trip to the origin), but subsequent requests hit the edge cache and serve in 10 to 25 ms. Pull CDNs risk origin load spikes when requests for popular, uncached content surge; this is mitigated by request collapsing and origin shielding. Push models offer the lowest first-byte latency and predictable availability because content preloads to all PoPs before user requests arrive. However, push increases operational overhead for content lifecycle management, requiring coordination to upload updates across hundreds of PoPs and remove stale versions. Push works well for infrequently changing, mission-critical assets like software binaries or live event streams where every request must hit cache.
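Request collapsing can be illustrated with a minimal sketch: concurrent cache misses for the same key on one edge node trigger a single origin fetch, and the other requests wait for that result. This is an illustrative model, not any CDN's real API; the class and the `fetch_from_origin` callable are our names, and error handling is omitted.

```python
import threading

class CollapsingCache:
    """Pull-model edge cache sketch: concurrent misses for the same
    key collapse into one origin fetch (request collapsing)."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable: key -> content
        self._cache = {}                  # key -> cached content
        self._inflight = {}               # key -> Event for a pending fetch
        self._lock = threading.Lock()

    def get(self, key):
        with self._lock:
            if key in self._cache:        # cache hit: serve from the edge
                return self._cache[key]
            event = self._inflight.get(key)
            if event is None:             # first miss: this caller fetches
                event = threading.Event()
                self._inflight[key] = event
                leader = True
            else:                         # collapsed follower: wait instead
                leader = False
        if leader:
            content = self._fetch(key)    # the single round trip to origin
            with self._lock:
                self._cache[key] = content
                del self._inflight[key]
            event.set()                   # wake the collapsed waiters
            return content
        event.wait()                      # followers never touch the origin
        with self._lock:
            return self._cache[key]
```

Ten simultaneous misses for the same URL then produce exactly one origin request per PoP, which is what keeps a sudden popularity spike from overwhelming the origin.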
Configuration propagation for cache policies, routing rules, and Web Application Firewall (WAF) settings uses globally replicated control planes, typically implemented with distributed message buses similar to Kafka. These systems accept eventual consistency because achieving strong consistency across 300+ globally distributed PoPs would require complex coordination protocols that harm availability and add latency. In practice, configuration changes propagate within seconds to low tens of seconds. During propagation windows, different PoPs may enforce different rules, creating temporary inconsistencies. For example, a WAF rule update might block malicious requests at European PoPs while Asian PoPs still allow them for a few seconds. Production systems monitor propagation lag per region, expose SLAs around update latency, and design rules to fail safe (default deny for security policies).
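The mixed-policy window during propagation can be modeled with a toy simulation (all names ours, not a real control-plane API): each PoP applies a versioned rule update only after its own propagation lag elapses, so for a few seconds different regions enforce different rules.

```python
class PoP:
    """Toy model of an edge PoP holding an eventually consistent
    WAF rule snapshot; `lag` simulates propagation delay in seconds."""

    def __init__(self, name, lag):
        self.name, self.lag = name, lag
        self.rules, self.version = {}, 0

    def sync(self, update, now):
        # The update becomes visible only after this PoP's lag elapses,
        # and only if it is newer than what the PoP already holds.
        if now - update["published_at"] >= self.lag and update["version"] > self.version:
            self.rules.update(update["rules"])
            self.version = update["version"]

    def decide(self, signature):
        # Requests matching no rule are allowed by default in this toy.
        return self.rules.get(signature, "allow")

def propagate(pops, update, now):
    """Deliver one control-plane update to every PoP at logical time `now`."""
    for pop in pops:
        pop.sync(update, now)
```

Running this with a fast European PoP and a slower Asian one reproduces the scenario in the text: five seconds after publish, `eu-west` already denies the attack signature while `ap-east` still allows it; after the full window, both converge.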
The trade-off between TTL-based expiration and explicit purges mirrors the pull-versus-push debate. TTL expiration scales effortlessly because each PoP independently decides when to refetch content, requiring no global coordination. However, TTL expiration risks serving stale content until expiry, which may be unacceptable for price changes, security patches, or content policy violations. Explicit purges provide precise control, invalidating content across all PoPs within seconds. Yet purges require robust, low-latency propagation infrastructure and still face eventual-consistency windows. Versioned URLs sidestep this dilemma entirely for immutable content: the old version remains cached while new requests fetch the new version, achieving instant updates with zero purge overhead. For dynamic content that cannot be versioned, short TTLs plus background revalidation, with purges reserved for emergencies, balance freshness, performance, and operational complexity.
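Versioned URLs are typically generated by embedding a content hash in the asset path at build time, so a changed asset gets a brand-new URL and the old one can simply age out of caches. A minimal sketch (the function name is ours):

```python
import hashlib

def versioned_url(base_url: str, content: bytes) -> str:
    """Content-addressed URL: new content yields a new URL, so stale
    copies live out their TTL at the edge without any purge."""
    digest = hashlib.sha256(content).hexdigest()[:12]
    filename = base_url.rsplit("/", 1)[-1]
    if "." in filename:
        # /assets/app.js -> /assets/app.<hash>.js
        stem, ext = base_url.rsplit(".", 1)
        return f"{stem}.{digest}.{ext}"
    # No extension: fall back to a cache-busting query parameter.
    return f"{base_url}?v={digest}"
```

Because the hash is deterministic, rebuilding unchanged content reuses the same URL (and the same cached copy), while any byte-level change produces a fresh URL that bypasses every stale edge entry instantly.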
💡 Key Takeaways
• Pull CDNs fetch on demand (first request pays the cache-miss penalty; subsequent requests hit the edge at 10 to 25 ms); push CDNs preload content (lowest first-byte latency but higher operational overhead for lifecycle management)
• The pull model risks origin load spikes on popular uncached content, mitigated by request collapsing (one origin fetch per PoP) and origin shielding (a regional mid-tier between edge and origin)
• Configuration propagation (cache policies, routing, WAF rules) uses distributed message buses with eventual consistency, typically completing globally within seconds but creating temporary mixed-policy windows across PoPs
• TTL expiration scales without coordination but risks staleness; explicit purges provide precision but require low-latency propagation and still face eventual-consistency windows; versioned URLs eliminate the trade-off for immutable assets
• Production systems monitor per-region propagation lag, expose SLAs around update latency (often low tens of seconds), and design policies to fail safe during propagation windows
📌 Examples
Amazon CloudFront uses a pull model with origin shielding. When an edge PoP in Tokyo misses cache, it queries a regional shield in Tokyo rather than directly fetching from a US origin. If the shield has the content, round trip is approximately 5 ms. If the shield also misses, it makes one origin request and multiple edge PoPs share that shielded response.
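The shielding flow in this example can be sketched as a chain of cache tiers, each falling through to its parent on a miss. This is a generic model, not CloudFront's actual implementation; the class names are ours.

```python
class Origin:
    """Stand-in origin server that counts how often it is hit."""
    def __init__(self):
        self.fetches = 0
    def get(self, key):
        self.fetches += 1
        return f"content:{key}"

class Tier:
    """One cache tier (edge PoP or regional shield); a miss falls
    through to `parent`, and the response is cached on the way back."""
    def __init__(self, name, parent):
        self.name, self.parent = name, parent
        self.store, self.misses = {}, 0
    def get(self, key):
        if key in self.store:
            return self.store[key]
        self.misses += 1
        value = self.parent.get(key)   # fall through to the next tier
        self.store[key] = value
        return value
```

Wiring three Tokyo edge PoPs behind one regional shield shows the point of the design: each edge misses once, but the shield absorbs two of those misses, so the distant origin is contacted only a single time.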
A video streaming service uses push CDN for live events. The encoder uploads video segments to a central location, and the CDN proactively replicates them to all 300 PoPs before viewers request them. This ensures every viewer experiences cache hits with predictable 10 to 15 ms latency even during the initial surge.
Microsoft Azure CDN propagates WAF rule updates through a global control plane. When a security team adds a rule to block a new attack pattern, the update reaches 80% of PoPs within 5 seconds and 99% within 15 seconds. During the propagation window, some PoPs still allow malicious requests, so the security team monitors attack traffic per region to verify full deployment.
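Per-region propagation monitoring like the 80%/99% figures above usually reduces to computing percentiles over per-PoP lag samples. A small sketch using the nearest-rank method (the function name is ours):

```python
import math

def propagation_percentile(lags, pct):
    """Nearest-rank percentile over observed per-PoP propagation
    lags (seconds): the smallest lag that at least pct% of PoPs beat."""
    ordered = sorted(lags)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]
```

An operator would feed this the per-PoP apply delays for a config version and alert when, say, the p99 lag exceeds the published SLA (low tens of seconds), signaling that some region is lagging the global rollout.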