What Are Sample Ratio Mismatch and Identity Churn Failures?
Why SRM Matters
For a 50/50 split with 100,000 users, you expect roughly 50,000 in each arm. Getting 51,000/49,000 (a 2% relative deviation per arm) is highly unlikely by random chance (p < 0.001). This signals systematic bias: bucketing bugs, logging drops, or differential eligibility. Results from an experiment exhibiting SRM cannot be trusted.
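The arithmetic behind that p-value can be sketched with only the standard library (`chi2_sf_1df` is a small helper written here, not a library function; it works because a chi-squared variable with 1 degree of freedom is the square of a standard normal):

```python
import math

def chi2_sf_1df(x):
    # Survival function of chi-squared with 1 df:
    # P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2))

observed = [51_000, 49_000]
expected = [50_000, 50_000]  # intended 50/50 split of 100,000 users
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
p = chi2_sf_1df(chi2)
print(chi2)       # 40.0
print(p < 0.001)  # True -- far beyond random chance
```

A chi-squared statistic of 40 on 1 degree of freedom corresponds to a p-value around 2.5e-10, which is why a seemingly small 51,000/49,000 imbalance is a red flag rather than noise.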
Common causes: the treatment crashes more often (dropping users from logs), the treatment loads slower (users abandon before logging fires), redirect-based treatments lose users who don't follow the redirect, and bot traffic is unevenly distributed across variants.
Detection
Run a chi-squared goodness-of-fit test on observed vs expected counts. Alert if p < 0.001 (strong SRM) or p < 0.01 (concerning SRM). Check SRM daily during the experiment, not just at the end: SRM appearing mid-experiment indicates a deployment or logging change.
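A minimal two-variant version of that check, using the thresholds above (the function name, signature, and verdict strings are illustrative, not a standard API; stdlib-only, so it restricts itself to the df = 1 case):

```python
import math

def srm_check(control, treatment, ratio=0.5,
              strong=0.001, concerning=0.01):
    """Two-variant chi-squared SRM check (df = 1).

    ratio is the intended fraction of traffic in control.
    Returns (p_value, verdict), where verdict is one of
    'strong SRM', 'concerning SRM', or 'ok'.
    """
    total = control + treatment
    expected = [total * ratio, total * (1 - ratio)]
    chi2 = sum((o - e) ** 2 / e
               for o, e in zip([control, treatment], expected))
    # chi2 with 1 df is the square of a standard normal,
    # so its survival function is erfc(sqrt(chi2 / 2)).
    p = math.erfc(math.sqrt(chi2 / 2))
    if p < strong:
        return p, "strong SRM"
    if p < concerning:
        return p, "concerning SRM"
    return p, "ok"

# Run daily against cumulative counts, not just at experiment end.
print(srm_check(51_000, 49_000))  # verdict: 'strong SRM'
print(srm_check(50_100, 49_900))  # verdict: 'ok'
```

For more than two variants the same statistic applies, but the survival function needs the general chi-squared distribution (e.g. `scipy.stats.chisquare`).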
Identity Churn
Users who clear cookies, switch devices, or reinstall apps may get reassigned to different variants. This appears as SRM plus contaminated within-user comparisons. Track identity stability metrics and exclude high-churn users from analysis.
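One way to operationalize "exclude high-churn users" is to flag any identity observed in more than one variant (a sketch under simplifying assumptions; `split_by_identity_stability` and the event shape are hypothetical, and a real pipeline would key on hashed IDs and time-window the exposures):

```python
from collections import defaultdict

def split_by_identity_stability(events):
    """events: iterable of (user_id, variant) exposure records.

    Returns (stable, churned): users seen in exactly one variant
    vs. users whose identity crossed variants (e.g. after a cookie
    clear or device switch caused reassignment).
    """
    variants = defaultdict(set)
    for user_id, variant in events:
        variants[user_id].add(variant)
    stable = {u for u, vs in variants.items() if len(vs) == 1}
    churned = set(variants) - stable
    return stable, churned

events = [("u1", "control"), ("u1", "control"),
          ("u2", "control"), ("u2", "treatment"),  # reassigned mid-test
          ("u3", "treatment")]
stable, churned = split_by_identity_stability(events)
print(sorted(stable))   # ['u1', 'u3']
print(sorted(churned))  # ['u2']
```

The churned fraction (here 1 of 3 users) doubles as an identity-stability metric to track over the life of the experiment; a rising value suggests the identity layer, not the treatment, is driving the mismatch.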