Chapter 19, “Load Balancing at the Frontend”, The Site Reliability Engineering Book.
Disclaimer: These notes are my own, and they may not fully appreciate the intention of the authors or the contents of the book.
The chapter describes one possible design for serving user requests across multiple data-centres and multiple front-line machines within each data-centre (collectively referred to as Google’s “Frontend”). Geographically distributed data-centres, each load-balanced internally, are necessary for redundancy, capacity, and user proximity. But this introduces challenges: deciding which data-centre serves a particular kind of user request given service-level and technical constraints, building load balancing that scales within operational and capacity constraints, and working out how to implement such mechanisms.
A multi-level solution is used, and the first-level mechanism is DNS. Anycast DNS is used on the side of the authoritative nameserver to approximate a user’s location, with heuristics and newer techniques (the EDNS0 client-subnet extension) to account for recursive resolvers sitting between the user and the nameserver. This information is used to direct the user to the “optimal” data-centre.
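To make the resolver problem concrete, here is a minimal sketch of how an authoritative nameserver might choose a data-centre. Everything in it is an assumption for illustration: the prefix table, the data-centre names, and the function names are hypothetical, and a real system would also weigh capacity and health, which this ignores.

```python
import ipaddress

# Hypothetical prefix-to-data-centre table (documentation address ranges only).
PREFIX_TO_DC = {
    ipaddress.ip_network("192.0.2.0/24"): "dc-eu",
    ipaddress.ip_network("198.51.100.0/24"): "dc-us",
    ipaddress.ip_network("203.0.113.0/24"): "dc-asia",
}
DEFAULT_DC = "dc-us"  # assumed fallback when nothing matches


def dc_for_address(addr):
    """Return the data-centre whose prefix contains addr, else the default."""
    for network, dc in PREFIX_TO_DC.items():
        if addr in network:
            return dc
    return DEFAULT_DC


def pick_datacentre(resolver_ip, client_subnet=None):
    """Prefer the EDNS0 client subnet; otherwise fall back to the resolver's IP.

    Without the client subnet, the nameserver only sees the recursive
    resolver, which may be far away from the actual user.
    """
    if client_subnet is not None:
        network = ipaddress.ip_network(client_subnet, strict=False)
        return dc_for_address(network.network_address)
    return dc_for_address(ipaddress.ip_address(resolver_ip))


if __name__ == "__main__":
    # Same resolver, but the client subnet reveals a user in another region.
    print(pick_datacentre("203.0.113.53"))
    print(pick_datacentre("203.0.113.53", client_subnet="192.0.2.0/24"))
```

The two calls use the same resolver address but can return different data-centres, which is the gap the client-subnet information is meant to close.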
The second-level mechanism is virtual IP addresses, whereby multiple network interfaces receive requests addressed to one public IP address. Here, virtual means that, from the perspective of a user request, the address behaves just like an IP address bound to a single network card. A network load balancer implements the virtual IP address and distributes traffic across backends using scalable mechanisms (connection tracking, consistent hashing). Network address translation (NAT) is avoided in favour of tunnelled direct server return (L3 DSR) for scalability.
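As a rough sketch of how connection tracking and consistent hashing could work together, the toy balancer below remembers a backend per connection while its table has room and answers from a hash ring alone once the table is full. This is my own simplification, not the book's implementation: the class names, the `max_tracked` limit, and the use of the ring to place new connections are all assumptions.

```python
import bisect
import hashlib


def _hash(value):
    """Map a string to a 64-bit integer position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class Ring:
    """Consistent hash ring with virtual nodes per backend."""

    def __init__(self, backends, vnodes=100):
        pairs = sorted((_hash(f"{b}#{i}"), b) for b in backends for i in range(vnodes))
        self._points = [point for point, _ in pairs]
        self._owners = [owner for _, owner in pairs]

    def lookup(self, key):
        # First ring point clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._points, _hash(key)) % len(self._points)
        return self._owners[idx]


class Balancer:
    """Tracks connections while there is room, otherwise hashes statelessly."""

    def __init__(self, backends, max_tracked=10000):
        self.ring = Ring(backends)
        self.max_tracked = max_tracked  # hypothetical memory limit
        self.table = {}  # connection id -> backend

    def route(self, conn_id):
        if conn_id in self.table:
            return self.table[conn_id]  # tracked: always the same backend
        backend = self.ring.lookup(conn_id)
        if len(self.table) < self.max_tracked:
            self.table[conn_id] = backend  # remember it while we have room
        return backend  # under pressure: the hash alone decides


if __name__ == "__main__":
    lb = Balancer(["backend-1", "backend-2", "backend-3"], max_tracked=2)
    for conn in ["198.51.100.7:40001", "198.51.100.8:40002", "198.51.100.9:40003"]:
        print(conn, "->", lb.route(conn))
```

Packets of the same connection reach the same backend in both modes, as long as the set of backends does not change.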
Frankly Asked Questions
(Things I didn’t quite get – maybe they were answered and I misread, don’t blame me if it was already answered or if I’m wrong about them!)
Question: Why not just use consistent hashing at the network load balancer all the time? Why only use it as a fallback and prefer connection state tracking?
Idea: Consistent hashing is not 100% stable. There is still a possibility that a connection is terminated due to changes in buckets (e.g. during deployments, autoscaling). Connection tracking doesn’t have this problem, which is why consistent hashing is only used as a fallback.
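The instability can be seen in a small experiment. The sketch below builds a hash ring before and after a backend is added and lists the connections whose assignment changed. The backend names, connection ids, and helper functions are hypothetical, and the ring is a textbook version, not what Google runs.

```python
import bisect
import hashlib


def h(value):
    """Hash a string to a 64-bit position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


def build_ring(backends, vnodes=100):
    """Return parallel lists: sorted ring positions and their owning backends."""
    pairs = sorted((h(f"{b}#{i}"), b) for b in backends for i in range(vnodes))
    return [p for p, _ in pairs], [b for _, b in pairs]


def lookup(ring, key):
    """Find the backend owning the first position clockwise from the key."""
    points, owners = ring
    return owners[bisect.bisect(points, h(key)) % len(points)]


def moved_connections(keys, backends_before, backends_after):
    """Connections that a pure-hash balancer would send somewhere new."""
    old, new = build_ring(backends_before), build_ring(backends_after)
    return [k for k in keys if lookup(old, k) != lookup(new, k)]


if __name__ == "__main__":
    conns = [f"conn-{i}" for i in range(1000)]
    moved = moved_connections(conns, ["b1", "b2", "b3"], ["b1", "b2", "b3", "b4"])
    print(f"{len(moved)} of {len(conns)} connections changed backend")
```

Only connections that land on the new backend move, which is far better than rehashing everything (in expectation about one in four here), but each moved connection is an established flow that would be reset. A tracking table keeps those flows pinned to their original backend, so hashing only has to decide for connections that are not tracked.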
Question: Why not just apply anycast to the Virtual IP?
Idea: 1. Connections reset when a user roams between peering arrangements. 2. Traffic cannot be shaped; it is left at the mercy of third-party routing policies. [See: Anycast Is Not Load Balancing]
Question: What if there was a way to attempt connection to all addresses in a DNS record and just use the first one that accepts your connection? Can’t you just list all of your virtual IP addresses in a single DNS record served globally? (Happy Eyeballs)
Idea: Like anycast, this loses the ability to shape traffic. As the book mentions, if SRV records with weights had been widely adopted, maybe this could have worked.
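For a feel of what weighted records would buy, here is a sketch of a client choosing among SRV-style records: lowest priority first, then a weighted random pick. It simplifies the selection procedure in the SRV specification, and the record values and target names are made up for the example.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # relative share of traffic within one priority
    target: str


def pick_target(records, rng=random):
    """Pick a target from the lowest-priority group, proportionally to weight."""
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights zero: treat the candidates as equal.
        return rng.choice(candidates).target
    return rng.choices(candidates, weights=weights, k=1)[0].target


if __name__ == "__main__":
    records = [
        SrvRecord(10, 3, "vip-a.example"),  # should get about 3/4 of the traffic
        SrvRecord(10, 1, "vip-b.example"),  # should get about 1/4 of the traffic
        SrvRecord(20, 5, "vip-c.example"),  # backup, never chosen by this sketch
    ]
    picks = [pick_target(records) for _ in range(1000)]
    for target in ("vip-a.example", "vip-b.example", "vip-c.example"):
        print(target, picks.count(target))
```

The operator shapes traffic by publishing different weights, and each client does the spreading. A real client would also fall back to the next priority when a target is unreachable, which this sketch leaves out.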