Design Fundamentals • URL Shortener Design • Medium • ⏱️ ~3 min
Analytics, Safety, and Write-Path Trade-offs
The write path (shortening a URL) involves several stages beyond simple token generation and storage: validation and canonicalization of the input URL, safety and abuse checks, analytics event emission, and cache priming. Canonicalization normalizes the URL so that semantically identical destinations do not produce duplicate short links (e.g., http://example.com, http://example.com/, and http://EXAMPLE.COM are treated as the same URL). This involves scheme normalization, host lowercasing, punycode conversion for internationalized domain names, percent-encoding normalization, and default-port removal. However, over-aggressive canonicalization risks conflating truly distinct URLs (e.g., case-sensitive path parameters), so production systems carefully tune rules to balance deduplication against correctness.
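A minimal Python sketch of these canonicalization rules (an illustration, not any particular system's implementation), using only the standard library. Percent-encoding normalization is omitted for brevity, and the path, query, and fragment are deliberately left untouched to avoid conflating case-sensitive URLs:

```python
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}

def canonicalize(url: str) -> str:
    """Normalize a URL for deduplication; path and query stay as-is
    because they may be case sensitive."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()

    # Lowercase the host and convert internationalized domain names to punycode.
    host = parts.hostname or ""
    host = host.encode("idna").decode("ascii") if host else ""

    # Drop the port only when it is the default for the scheme.
    port = parts.port
    netloc = host if port is None or port == DEFAULT_PORTS.get(scheme) else f"{host}:{port}"

    # Treat an empty path as "/" so http://example.com and http://example.com/ match.
    path = parts.path or "/"

    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))

# The three variants from the example above collapse to one canonical form.
assert canonicalize("HTTP://EXAMPLE.COM:80") == canonicalize("http://example.com/")
```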
Safety and abuse checks are critical for maintaining domain reputation and user trust, but they add latency and complexity. Synchronous checks (querying a URL reputation service or blocklist before generating the short link) can add 50 to 200 milliseconds per write, degrading user experience. Asynchronous checks decouple safety from the write path: accept the URL, generate the token, respond immediately, and scan the destination out of band via a queue. If abuse is detected later, mark the token as unsafe and show an interstitial or block redirects. This buys lower write latency at the cost of a window during which malicious links remain reachable. Systems typically combine lightweight synchronous checks (a local blocklist lookup in under 1 millisecond) with heavy asynchronous scanning (content fetching, reputation lookups, ML models).
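A sketch of this hybrid approach, with an in-process queue and dictionary standing in for the real message broker and database; `deep_scan` is a hypothetical placeholder for the heavy content-fetching, reputation, and ML checks:

```python
import queue
import secrets
from urllib.parse import urlsplit

# Stand-ins for production components: the blocklist would typically be an
# in-memory set or Bloom filter, and the queue would be Kafka or Kinesis.
LOCAL_BLOCKLIST = {"malware.example", "phish.example"}
SCAN_QUEUE: "queue.Queue[str]" = queue.Queue()
STORE: dict[str, dict] = {}  # token -> record; a real system uses a database

def shorten(url: str) -> dict:
    """Write path: cheap synchronous check, then defer heavy scanning."""
    host = urlsplit(url).hostname or ""

    # Lightweight synchronous check (sub-millisecond): reject known-bad hosts.
    if host in LOCAL_BLOCKLIST:
        raise ValueError("destination is on the local blocklist")

    # Generate the token and persist it with a provisional safety status.
    token = secrets.token_urlsafe(5)
    STORE[token] = {"url": url, "safety": "unscanned"}

    # Enqueue for asynchronous scanning and respond immediately; redirects
    # consult the safety flag that the worker sets later.
    SCAN_QUEUE.put(token)
    return {"token": token}

def deep_scan(url: str) -> str:
    """Placeholder for the heavy out-of-band checks described above."""
    return "ok"

def scan_worker() -> None:
    """Out-of-band worker: flips tokens to 'ok' or 'blocked' after analysis."""
    while not SCAN_QUEUE.empty():
        token = SCAN_QUEUE.get()
        STORE[token]["safety"] = deep_scan(STORE[token]["url"])
```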
Analytics follow a similar pattern: inline counters (incrementing a click count in the database during each redirect) provide perfect accuracy but add write latency, contention on hot keys, and database load. At 8,000 redirects per second, synchronous increments would require 8,000 writes per second to a counter table, creating hotspots and limiting scalability. Asynchronous analytics emit an event per redirect (token, timestamp, coarse geolocation from IP, user agent, referrer) to a queue (Kafka, Kinesis, or Pub/Sub) and process it in batches. This scales horizontally and supports complex aggregations, but introduces at-least-once delivery semantics, so consumers must be idempotent (unique event IDs plus deduplication) to prevent double counting. Sampling (recording only 10% or 1% of events) reduces volume but biases metrics. Most production systems use full asynchronous event streaming with deduplication and accept eventual consistency (dashboards update within seconds to minutes).
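A sketch of the producer and consumer sides of this pattern, with an in-memory list standing in for the Kafka/Kinesis/Pub/Sub topic; the key idea is the unique `event_id` that lets the consumer drop redelivered events:

```python
import time
import uuid
from collections import defaultdict

# In production the producer writes to a message broker and the consumer
# deduplicates in a stream processor or warehouse; this sketch keeps both
# in-process to show the idempotency mechanics.
EVENT_LOG: list[dict] = []

def emit_redirect_event(token: str, geo: str, user_agent: str, referrer: str) -> None:
    """Producer side of the redirect hot path: fire-and-forget event emission."""
    EVENT_LOG.append({
        "event_id": str(uuid.uuid4()),  # unique ID enables downstream dedup
        "token": token,
        "ts": time.time(),
        "geo": geo,          # coarse geolocation derived from the client IP
        "ua": user_agent,
        "ref": referrer,
    })

def aggregate_clicks(events: list[dict]) -> dict[str, int]:
    """Consumer side: at-least-once delivery may replay events, so track
    seen event IDs and count each event exactly once."""
    seen: set[str] = set()
    counts: dict[str, int] = defaultdict(int)
    for event in events:
        if event["event_id"] in seen:
            continue  # duplicate delivery; skip to avoid double counting
        seen.add(event["event_id"])
        counts[event["token"]] += 1
    return dict(counts)

# A redelivered batch (every event appears twice) still yields correct counts.
emit_redirect_event("abc123", "US", "Mozilla/5.0", "https://news.example")
assert aggregate_clicks(EVENT_LOG + EVENT_LOG) == {"abc123": 1}
```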
💡 Key Takeaways
•URL canonicalization prevents duplicates: normalize the scheme, lowercase the host, convert IDNs to punycode, and remove default ports; over-normalization can wrongly conflate distinct URLs with case-sensitive paths
•Synchronous safety checks add 50 to 200 milliseconds per write; asynchronous scanning decouples safety from the write path, achieving sub-10-millisecond response times at the cost of a window during which malicious links remain reachable
•Inline analytics counters at 8,000 redirects per second require 8,000 writes per second, creating database hotspots; asynchronous event streaming scales horizontally and supports complex aggregations
•At-least-once event delivery in asynchronous analytics requires idempotency, using unique event IDs and deduplication to prevent double counting; without this, metrics can be inflated by 10% or more
•Combining lightweight synchronous checks (local blocklist lookup under 1 millisecond) with heavy asynchronous checks (content fetch, ML models) balances safety, user experience, and scalability
•Cache priming on the write path (write-through) ensures new short URLs are immediately available in cache, avoiding cold-start misses for the first few redirects but adding write latency (see the sketch after this list)
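For the cache-priming takeaway, a rough write-through sketch, assuming plain dictionaries as stand-ins for Redis and the backing database:

```python
import time

CACHE: dict[str, tuple[str, float]] = {}  # token -> (url, expiry); Redis in production
CACHE_TTL_SECONDS = 3600

def create_short_url(token: str, url: str, db: dict) -> None:
    """Write-through: persist to the database, then prime the cache so even
    the first redirects for a new link hit the cache."""
    db[token] = url                                          # durable store
    CACHE[token] = (url, time.time() + CACHE_TTL_SECONDS)    # prime the cache

def resolve(token: str, db: dict) -> "str | None":
    """Read path: a cache hit avoids the database round trip."""
    hit = CACHE.get(token)
    if hit and hit[1] > time.time():
        return hit[0]
    url = db.get(token)  # miss or expired entry: fall back to the database
    if url is not None:
        CACHE[token] = (url, time.time() + CACHE_TTL_SECONDS)
    return url
```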
📌 Examples
Bitly performs asynchronous safety scanning by enqueueing destination URLs to a Kafka topic; workers fetch content, check reputation APIs, and run phishing detection models, marking unsafe tokens within seconds
Google's g.co emits redirect events to a Pub Sub topic with unique event IDs; downstream BigQuery jobs deduplicate using event IDs and aggregate metrics, updating dashboards every 5 minutes with eventual consistency
A URL shortener that incremented a click counter synchronously during redirects saw database CPU spike to 90% at 5,000 QPS; switching to asynchronous event streaming reduced database load by 80% and scaled to 20,000 QPS