Ramp-Up Strategies: Traffic Shaping and Cohort Assignment
CONSISTENT USER ASSIGNMENT
Users must stay in the same cohort (control or canary) for the entire experiment. If a user flips between versions mid-session, you cannot attribute their behavior to either version. Use deterministic, consistent hashing: hash(user_id) mod 10000 maps each user to a bucket in 0-9999. At 5% canary, buckets 0-499 receive the new version. If user 123456 hashes to bucket 3456, they stay in control until the ramp reaches 35% (buckets 0-3499).
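The bucket assignment described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the SHA-256 hash, the salt string "canary-exp-1", and the function names bucket_for and in_canary are assumptions made for the example.

```python
import hashlib

NUM_BUCKETS = 10_000  # buckets 0-9999, as in the text


def bucket_for(user_id: str, salt: str = "canary-exp-1") -> int:
    """Map a user to a stable bucket. The salt is a hypothetical
    per-experiment string so different experiments get independent cohorts."""
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).digest()
    # Use the first 8 bytes as an unsigned integer, then reduce to a bucket.
    return int.from_bytes(digest[:8], "big") % NUM_BUCKETS


def in_canary(user_id: str, canary_percent: float, salt: str = "canary-exp-1") -> bool:
    """A user is in the canary when their bucket falls below the ramp threshold.
    At 5%, the threshold is 500, so buckets 0-499 receive the new version."""
    threshold = round(canary_percent / 100 * NUM_BUCKETS)
    return bucket_for(user_id, salt) < threshold
```

Because the bucket depends only on the user ID and the salt, raising the percentage only ever adds users to the canary; nobody who is already in the canary drops back to control during a ramp.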
STRATIFIED SAMPLING
Pure user ID hashing can create biased cohorts. If your hash function happens to put 80% mobile users in the canary when the population is 60% mobile, your metrics are skewed. Stratified sampling hashes within segments: mobile users get buckets 0-5999, desktop users get buckets 6000-9999. Then apply the 5% threshold within each segment (mobile buckets 0-299, desktop buckets 6000-6199) so the canary keeps the population's 60/40 mix.
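One way to sketch the per-segment assignment is shown below. The segment ranges follow the 60/40 split in the text; the names SEGMENT_RANGES, stratified_bucket, canary_cutoff, and in_canary_stratified are hypothetical, and SHA-256 is an assumed hash choice.

```python
import hashlib

# Bucket ranges per segment as [start, end), mirroring the 60/40 split above.
SEGMENT_RANGES = {"mobile": (0, 6000), "desktop": (6000, 10000)}


def stratified_bucket(user_id: str, segment: str) -> int:
    """Hash the user into their own segment's bucket range."""
    start, end = SEGMENT_RANGES[segment]
    digest = hashlib.sha256(f"{segment}:{user_id}".encode("utf-8")).digest()
    return start + int.from_bytes(digest[:8], "big") % (end - start)


def canary_cutoff(segment: str, canary_percent: float) -> int:
    """First bucket in the segment that stays in control."""
    start, end = SEGMENT_RANGES[segment]
    return start + round((end - start) * canary_percent / 100)


def in_canary_stratified(user_id: str, segment: str, canary_percent: float) -> bool:
    """Apply the same percentage threshold inside each segment."""
    return stratified_bucket(user_id, segment) < canary_cutoff(segment, canary_percent)
```

At 5%, canary_cutoff returns 300 for mobile and 6200 for desktop, so each segment contributes its canary users in proportion to its size.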
CAPABILITY PROBING
Not all clients support new features. If your new model requires dense embeddings but 10% of users run old app versions that do not send them, routing those users to the canary produces a 10% null rate for that feature and potential crashes. With capability probing, the client sends {supports_dense_embeddings: true, app_version: 2.5} and the server routes only capable clients to the canary.
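A server-side routing check along these lines might look like the sketch below. The payload keys come from the example above; the ClientCapabilities class, the parse_capabilities and route functions, and the choice to treat a missing or malformed flag as "not capable" are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClientCapabilities:
    supports_dense_embeddings: bool = False
    app_version: str = "0.0"


def parse_capabilities(payload: dict) -> ClientCapabilities:
    """Read the capability probe. Anything other than an explicit True is
    treated as unsupported, so old clients that send nothing stay safe."""
    return ClientCapabilities(
        supports_dense_embeddings=payload.get("supports_dense_embeddings") is True,
        app_version=str(payload.get("app_version", "0.0")),
    )


def route(payload: dict, assigned_to_canary: bool) -> str:
    """Send a request to the canary only if the user's cohort is canary
    AND the client reports the required capability."""
    caps = parse_capabilities(payload)
    if assigned_to_canary and caps.supports_dense_embeddings:
        return "canary"
    return "control"
```

For example, route({"supports_dense_embeddings": True, "app_version": 2.5}, True) returns "canary", while the same call with an empty payload returns "control".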
INFRASTRUCTURE COST
At 80k QPS peak, a 25% canary means running both versions side by side: control handles 60k QPS and canary handles 20k QPS. Expect roughly 5-10% extra compute while the two versions run in parallel. Budget for this overhead when planning rollout schedules.
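The arithmetic above can be checked with a short helper. This is only a back-of-the-envelope sketch: the function names are hypothetical, and expressing the 5-10% overhead in QPS-equivalent capacity is an assumption made for the example, not a measured cost model.

```python
def split_traffic(peak_qps: float, canary_percent: float) -> tuple[float, float]:
    """Return (control_qps, canary_qps) for a given canary percentage."""
    canary_qps = peak_qps * canary_percent / 100
    return peak_qps - canary_qps, canary_qps


def overhead_range(peak_qps: float, low_percent: float = 5,
                   high_percent: float = 10) -> tuple[float, float]:
    """Extra capacity to budget, expressed in QPS-equivalent,
    using the 5-10% overhead range quoted in the text."""
    return peak_qps * low_percent / 100, peak_qps * high_percent / 100


control_qps, canary_qps = split_traffic(80_000, 25)
extra_low, extra_high = overhead_range(80_000)
print(f"control={control_qps:.0f} QPS, canary={canary_qps:.0f} QPS")
print(f"budget an extra {extra_low:.0f}-{extra_high:.0f} QPS-equivalent of capacity")
```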