
Push vs Pull CDN Caching Strategies

CDNs populate caches through two fundamental strategies: pull (on-demand) and push (pre-positioning).

Pull caching is reactive. The first user request in a region for an object triggers a cache miss: the CDN fetches the object from origin, caches the response, and serves it. Subsequent requests in that region are cache hits until the object expires or is evicted. Pull is the default model for most web content because it requires no advance planning, automatically adapts to actual demand, and conserves cache storage by storing only what users request. The downside is cold-start latency: the first request in each region suffers a full round trip to origin, which might add 100 to 300 milliseconds compared to an edge-served request at 20 to 50 milliseconds.

Push caching (also called content pre-positioning or origin push) proactively uploads content to CDN edge servers before user requests arrive. This eliminates cold starts for known popular content, so even the first user in a region gets edge-served latency. Netflix Open Connect exemplifies the push strategy at scale: Netflix pre-positions popular video segments to cache appliances inside ISPs based on viewing predictions, achieving Time to First Byte (TTFB) under 50 milliseconds globally and offloading 95%+ of traffic from transit links. Push is ideal for large, predictable, high-demand content like video-on-demand catalogs, software updates (Windows, gaming), and live event pre-rolls. However, push requires accurate demand forecasting, consumes storage at scale (potentially petabytes across a global CDN), and requires lifecycle management to purge or expire stale pre-positioned content.

In practice, hybrid strategies dominate. A video streaming service might push the latest episode of a hit series to all regions before launch (eliminating cold start for the first wave of viewers) while relying on pull for long-tail catalog content that is viewed infrequently. Microsoft combines CDN push for popular Windows and Xbox updates with peer-assisted delivery, shaving 30% to 50% off CDN costs during major launches while maintaining sub-minute download start times. The tradeoff is operational complexity: push requires content distribution systems, accurate popularity models, and coordination between origin and CDN, whereas pull is operationally simple but less performant for predictable spikes.
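The difference between the two strategies can be illustrated with a toy edge cache. This is a minimal sketch, not how any real CDN is implemented: the EdgeCache class, the object names, and the two latency constants (picked from within the ranges quoted above) are all assumptions made for illustration, and TTL expiry and eviction are deliberately left out.

```python
from dataclasses import dataclass, field

# Illustrative latencies only (assumed values, not measurements).
ORIGIN_LATENCY_MS = 250  # cold-start fetch from origin
EDGE_LATENCY_MS = 30     # edge-served response


@dataclass
class EdgeCache:
    """Toy edge cache for one region; no TTL or eviction is modeled."""
    region: str
    store: dict = field(default_factory=dict)

    def push(self, key: str, body: bytes) -> None:
        """Pre-position an object before any user asks for it."""
        self.store[key] = body

    def get(self, key: str, origin: dict) -> tuple[bytes, int]:
        """Return (body, simulated latency in ms), pulling on a miss."""
        if key in self.store:
            return self.store[key], EDGE_LATENCY_MS
        body = origin[key]        # cache miss: fetch from origin
        self.store[key] = body    # keep it for later requests
        return body, ORIGIN_LATENCY_MS


# Hypothetical origin with one predicted hit and one long-tail object.
origin = {"ep1.mp4": b"episode-1", "old.mp4": b"long-tail"}
edge = EdgeCache("eu-west")
edge.push("ep1.mp4", origin["ep1.mp4"])  # push the predicted hit

first_pushed = edge.get("ep1.mp4", origin)[1]   # first request, already at edge
first_pulled = edge.get("old.mp4", origin)[1]   # first request, cold start
second_pulled = edge.get("old.mp4", origin)[1]  # now cached by the pull
print(first_pushed, first_pulled, second_pulled)  # 30 250 30
```

The pushed object is edge-fast on its very first request, while the pulled object pays the origin round trip once and is edge-fast afterwards, which is the hybrid behavior described above.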
💡 Key Takeaways
Pull caching (on-demand) fetches content on the first user request per region, adding 100 to 300 milliseconds cold start latency but requiring no advance planning or demand forecasting
Push caching (pre-positioning) uploads content to edges before user requests, eliminating cold starts and achieving sub-50 millisecond TTFB globally, but requires storage capacity planning and lifecycle management
Netflix Open Connect pre-positions popular video segments to ISP-embedded appliances, achieving 95%+ traffic served from push caches and reducing transit costs by an order of magnitude
Hybrid strategies are common: push for predictable high-demand content (new releases, game patches) and pull for long-tail content, balancing performance, cost, and operational complexity
Push models consume substantial edge storage (potentially petabytes globally) and require accurate popularity prediction; over-provisioning wastes capacity while under-provisioning causes cache eviction churn
Major software updates (50 to 100 GB) using push CDN combined with peer delivery reduce CDN egress costs by 30% to 50% during multi-Tbps launch spikes while maintaining rapid rollout
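One way to reason about the storage tradeoff in a hybrid setup is to rank objects by predicted demand per gigabyte and push only what fits an edge storage budget, leaving the rest to pull. The sketch below is a simple greedy heuristic under assumed inputs; the plan_push function, the catalog entries, their sizes, and the request forecasts are all hypothetical, and real popularity models are far more involved.

```python
def plan_push(catalog, budget_gb):
    """Split a catalog into push and pull sets under a storage budget.

    catalog: list of (name, size_gb, predicted_requests) tuples.
    Greedy heuristic: highest predicted requests per GB first.
    """
    ranked = sorted(catalog, key=lambda item: item[2] / item[1], reverse=True)
    push, pull, used = [], [], 0.0
    for name, size_gb, _ in ranked:
        if used + size_gb <= budget_gb:
            push.append(name)   # fits: pre-position it
            used += size_gb
        else:
            pull.append(name)   # does not fit: serve on demand
    return push, pull, used


# Hypothetical catalog: (name, size in GB, forecast requests per region).
catalog = [
    ("new-episode", 4.0, 900_000),
    ("game-patch", 60.0, 2_000_000),
    ("classic-film", 6.0, 1_200),
    ("indie-doc", 3.0, 300),
]
push, pull, used = plan_push(catalog, budget_gb=65.0)
print(push)  # ['new-episode', 'game-patch']
print(pull)  # ['classic-film', 'indie-doc']
print(used)  # 64.0
```

With a smaller budget the large patch would fall back to pull, and with a much larger one the long-tail titles would be pushed too, spending edge storage on content few users request.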
📌 Examples
Netflix pushes the next episode of a popular series to all global regions 24 hours before release; first viewers in each market see 30 millisecond TTFB instead of 250 millisecond cold-start fetch from origin data center
Microsoft Windows Update pre-positions cumulative update packages (2 to 5 GB) to CDN edges ahead of Patch Tuesday, ensuring millions of concurrent requests are served at 100+ Gbps aggregate without origin involvement
A live sports streaming service pushes pre-roll ads and opening segments 2 hours before kickoff, guaranteeing smooth start for the initial surge of viewers, then relies on pull for dynamic in-game highlights
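The lead times in these examples (24 hours, 2 hours) imply a scheduling calculation: the push has to start early enough that the transfer finishes before the lead-time window opens. A rough sketch follows, assuming a single object and a constant fill rate per edge; the latest_push_start function, the 45 GB size, the 1 Gbps fill rate, and the kickoff date are hypothetical values chosen only to make the arithmetic easy to follow.

```python
from datetime import datetime, timedelta


def latest_push_start(release, lead, size_gb, fill_gbps):
    """Latest time a push can begin and still finish `lead` before release.

    Assumes a constant fill rate; size is in gigabytes, rate in gigabits/s.
    """
    transfer_s = size_gb * 8 / fill_gbps  # GB to gigabits, then seconds
    return release - lead - timedelta(seconds=transfer_s)


# Hypothetical kickoff at 20:00 with content required at the edge 2 hours early.
release = datetime(2025, 1, 10, 20, 0)
start = latest_push_start(release, timedelta(hours=2), size_gb=45, fill_gbps=1.0)
print(start)  # 2025-01-10 17:54:00
```

Here 45 GB at 1 Gbps takes 360 seconds, so the push must begin by 17:54 to be complete at 18:00, two hours before the 20:00 kickoff.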