Blue Green and Canary Deployment Patterns for Model Rollout
Blue Green Deployment
Blue green runs the old and new model stacks in parallel, then switches all traffic at once by flipping a load balancer or router configuration. The old stack stays warm so that rollback is immediate if issues emerge. Netflix uses red/black (their term for blue green) extensively: the flip takes seconds, and the old server group remains ready for immediate reversion. This pattern doubles compute for as long as both stacks are running, but it provides the fastest rollback path. A typical blue green window at Netflix lasts 15 to 30 minutes before the old stack is torn down.
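The flip and rollback described above amount to swapping which stack the router points at. The following is a minimal sketch of that idea, not Netflix's tooling; `BlueGreenRouter`, its fields, and the stack names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class BlueGreenRouter:
    """Hypothetical router over two model stacks (names are illustrative)."""
    live: str = "blue"      # stack currently serving 100 percent of traffic
    standby: str = "green"  # stack kept warm but serving no traffic

    def route(self, request_id: str) -> str:
        # Every request goes to the live stack; there is no partial split.
        return self.live

    def flip(self) -> None:
        # The cutover is a single swap, which is why it completes in seconds.
        self.live, self.standby = self.standby, self.live

    def rollback(self) -> None:
        # Rollback is the same swap; it only works while the old stack is warm.
        self.flip()


router = BlueGreenRouter()
router.flip()                       # cut all traffic over to the new stack
after_flip = router.route("req-1")
router.rollback()                   # revert to the old stack
after_rollback = router.route("req-1")
```

Because rollback is just a second flip, it is only as fast as the cutover for as long as the old stack has not been torn down.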
Canary Deployment
Canary deployment shifts traffic gradually, for example from 1 percent to 5, 25, 50, and finally 100 percent, while monitoring SLOs and KPIs at each stage. Uber's typical path for ranking or ETA models is shadow mode first, then a 1 to 5 percent canary, with session stickiness enforced so that individual users see consistent predictions. Each canary stage runs for minutes to hours depending on how quickly its metrics become trustworthy: infrastructure metrics such as p99 latency stabilize in 5 to 30 minutes, but business KPIs such as conversion rate need hours to reach statistical significance.
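One common way to get both a percentage split and session stickiness is to hash the session identifier into a stable bucket. The sketch below assumes that approach; it is not a description of Uber's implementation, and `bucket`, `use_canary`, and the `salt` value are hypothetical names.

```python
import hashlib

STAGES = [1, 5, 25, 50, 100]  # percent of traffic on the new model, per stage


def bucket(session_id: str, salt: str = "rollout-v2") -> int:
    """Map a session to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(f"{salt}:{session_id}".encode()).hexdigest()
    return int(digest, 16) % 100


def use_canary(session_id: str, percent: int) -> bool:
    """Return True if this session should be served by the new model."""
    # Buckets below the threshold get the canary, so raising the percentage
    # only adds sessions and never moves one back to the old model.
    return bucket(session_id) < percent


# The same session gets the same answer on every request at a given stage.
decisions = [use_canary("session-42", 5) for _ in range(3)]
```

Changing the salt per rollout reshuffles which sessions land in the early stages, so the same users are not always the first to see a new model.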
The Speed vs Safety Trade-off
Blue green surfaces regressions quickly because 100 percent of traffic hits the new model at once, but that also means a larger blast radius: if the new model has a bug, all users are affected until rollback completes. Canary limits impact to small cohorts but extends rollout timelines and requires statistical rigor to detect small KPI deltas. Detecting a 0.5 percentage point drop in conversion rate requires tens of thousands of samples, which can take hours at lower traffic volumes.
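The sample-size figure above can be sanity-checked with the standard two-proportion power calculation. The sketch below assumes a 5 percent baseline conversion rate, a two-sided significance level of 0.05, and 80 percent power, and it treats the drop as 0.5 percentage points absolute; none of these values come from the text, and `samples_per_arm` is a hypothetical helper.

```python
from math import ceil
from statistics import NormalDist


def samples_per_arm(baseline: float, drop: float,
                    alpha: float = 0.05, power: float = 0.8) -> int:
    """Approximate samples needed in each arm to detect an absolute drop."""
    p1, p2 = baseline, baseline - drop
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided test
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / drop ** 2)


# Assumed 5 percent baseline, 0.5 percentage point absolute drop.
absolute_case = samples_per_arm(baseline=0.05, drop=0.005)

# Same baseline, but a 0.5 percent relative drop (5.000 to 4.975 percent).
relative_case = samples_per_arm(baseline=0.05, drop=0.05 * 0.005)
```

Under these assumptions the absolute case lands in the tens of thousands per arm, while the relative case runs into the millions, so it matters which kind of drop a guardrail is written against.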
Production Configuration
LinkedIn runs billions of predictions daily with p99 latency budgets of tens of milliseconds per subcall. Their canaries start at 1 percent, with strict guardrails on latency inflation and error rate spikes to protect aggregate page load times. Guardrails include p99 latency inflation below 20 percent, an error rate increase below 0.5 percentage points, and a CPU utilization delta below 10 percent. Any breach triggers automatic rollback within 5 minutes.
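The guardrails listed above can be expressed as a comparison between baseline and canary metrics. This is a minimal sketch, not LinkedIn's configuration: `Metrics` and `breached_guardrails` are hypothetical names, and the CPU threshold is assumed to mean 10 percentage points of utilization, which the text does not specify.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    p99_latency_ms: float
    error_rate: float       # fraction of requests, e.g. 0.01 for 1 percent
    cpu_utilization: float  # fraction of capacity, e.g. 0.55 for 55 percent


def breached_guardrails(baseline: Metrics, canary: Metrics) -> list[str]:
    """Return the names of the guardrails the canary violates."""
    breaches = []
    # p99 latency inflation must stay below 20 percent (relative).
    if canary.p99_latency_ms >= baseline.p99_latency_ms * 1.20:
        breaches.append("p99_latency")
    # Error rate increase must stay below 0.5 percentage points (absolute).
    if canary.error_rate - baseline.error_rate >= 0.005:
        breaches.append("error_rate")
    # Assumption: CPU delta is read as 10 percentage points (absolute).
    if canary.cpu_utilization - baseline.cpu_utilization >= 0.10:
        breaches.append("cpu_utilization")
    return breaches


baseline = Metrics(p99_latency_ms=40.0, error_rate=0.010, cpu_utilization=0.50)
healthy = Metrics(p99_latency_ms=44.0, error_rate=0.011, cpu_utilization=0.55)
slow = Metrics(p99_latency_ms=52.0, error_rate=0.011, cpu_utilization=0.55)

healthy_breaches = breached_guardrails(baseline, healthy)
slow_breaches = breached_guardrails(baseline, slow)
```

A rollout controller would run a check like this on a schedule and trigger rollback on any non-empty result; the 5 minute rollback bound then depends on how often the check runs and how long the traffic switch takes.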