DNS Resolution Architecture and Caching Hierarchy

DNS operates as a hierarchical, globally distributed naming system where a client's stub resolver (built into the OS) sends queries to a recursive resolver, which then traverses the DNS hierarchy: root servers, top level domain (TLD) servers, and finally authoritative nameservers. This traversal uses referrals at each layer until reaching the authoritative source for the requested zone. The system's performance relies fundamentally on caching at every layer including browser, OS, recursive resolver, and even authoritative servers.

Time To Live (TTL) values control how long DNS answers can be cached before requiring revalidation. Google Public DNS (8.8.8.8), deployed across 200+ locations, achieves cache hit rates above 80 to 95 percent, resulting in median resolution latency of single digit to tens of milliseconds for cache hits. Cold misses that must traverse the full hierarchy typically add 50 to 300 ms depending on geography. The 13 root server letters collectively comprise over 1000 anycast instances globally, handling 100+ billion queries daily, though most user queries never reach the root due to caching effectiveness.

Resolution predominantly uses UDP port 53 for low latency, with TCP fallback triggered when responses are truncated or when DNSSEC validation requires larger payloads. Modern resolvers advertise Extended DNS (EDNS0) to support larger response sizes, though operators often keep responses conservative to avoid IP fragmentation which causes packet loss. Anycast routing is the standard deployment model: the same IP address is announced from many geographic locations, and BGP routes clients to the nearest instance, providing both low latency and high availability through redundancy.

💡 Key Takeaways

✓Recursive resolvers traverse root, TLD, and authoritative layers using referrals; typical path adds 50 to 300 ms on cold misses but cache hit rates of 80 to 95 percent keep median latency under 20 to 30 ms in well peered regions

✓Root DNS infrastructure comprises 13 letter designations with 1000+ anycast instances handling 100+ billion queries per day; typical RTT to nearest root instance is single digit milliseconds

✓TTL values determine cache duration at every layer; low TTLs (10 to 60 seconds) enable faster failover but increase query volume by up to 10x, while high TTLs (5 to 24 hours) reduce cost but slow propagation

✓Negative caching for NXDOMAIN and NODATA responses is controlled by the SOA negative TTL field, typically set to minutes, reducing repeated queries for nonexistent names

✓UDP is preferred for queries due to low overhead, but responses exceeding path MTU trigger truncation (TC=1 flag) forcing TCP retry which adds one full RTT and increases server CPU load

✓Cloudflare's 1.1.1.1 resolver deployed in 285+ cities demonstrates production anycast scale, achieving p50 latency of 10 to 20 ms regionally and sub 100 ms globally

📌 Interview Tips

1Google Public DNS achieves 80 to 95 percent cache hit rates across 200+ global locations, keeping median resolution time under 20 ms for cached entries versus 50 to 150 ms for full hierarchy traversals

2A popular CDN setting TTL to 30 seconds on traffic steering records can shift traffic within one minute during failures, but generates 10x more queries compared to a 300 second TTL

3Root servers handle primarily misconfiguration noise; legitimate user queries rarely hit roots due to recursive resolver caching, with most resolvers querying roots only for new TLDs or cache expiry

← Back to DNS & Domain Resolution Overview