Three-Tier Guardrail Framework
Tier 1: Immediate Guardrails
Detect within minutes to hours. Error rates, crash rates, latency p99. These catch catastrophic failures fast. Threshold example: >1% error rate increase triggers automatic rollback. Measured via real-time streaming, checked every 5-15 minutes.
Tier 2: Session-Level Guardrails
Detect within hours to 1 day. Session completion rate, add-to-cart rate, bounce rate. These catch UX degradation before it compounds. Threshold example: >5% session bounce increase blocks experiment. Requires waiting for sessions to complete before measurement.
Tier 3: Long-Term Guardrails
Detect over days to weeks. 7-day retention, repeat purchase rate, NPS. These catch delayed harm that session metrics miss. Threshold example: >2% retention drop stops experiment. Requires longer runtime but catches the most important issues.
Action Matrix
Each tier has a response: Tier 1 = auto-rollback (immediate), Tier 2 = pause + alert (same day), Tier 3 = decision input (end of experiment). This tiered approach balances speed with thoroughness.