Implementation: Traffic Routing, Metric Collection, and Decision Engine
TRAFFIC ROUTING IMPLEMENTATION
Implement consistent hashing in your load balancer or application layer. bucket = hash(user_id) mod 10000. At 5% canary, route buckets 0-499 to new version. For stratified sampling, first assign users to segments (mobile buckets 0-5999, desktop 6000-9999), then apply percentage within each segment. Store the assignment decision in a cookie or session to ensure sticky routing.
METRIC COLLECTION PIPELINE
Tag every request with: version (control/canary), cohort bucket, user segment, timestamp. Aggregate at 1-5 minute granularity for fast feedback. Use 5-15 minute trailing windows for comparison. Stream events to a real-time aggregation system that computes running statistics: mean latency, error rate, CTR by cohort. Store raw events for post-hoc analysis.
TWO-LAYER DECISION ENGINE
Layer 1 (automatic): Guardrail violations trigger immediate rollback. Error rate delta > 0.1% for 5 minutes, P99 > 300ms for 10 minutes, or feature null rate > 1%. No human intervention required.
Layer 2 (statistical): Product metrics require statistical tests with CUPED adjustment and multiple comparison correction. Compute confidence intervals and p-values. Alert humans for borderline cases.
RAMP SCHEDULE WITH GATES
Typical schedule: 0.5% → 1% → 5% → 10% → 25% → 50% → 100%. Between each step, verify: latency delta within bounds, error rate stable, downstream service headroom sufficient, no data quality alerts. Log each gate decision for audit. Full ramp typically takes 24-48 hours with conservative gates.