The Ties that un-Bind: Decoupling IP from web services and sockets for robust addressing agility at CDN-scale

Published January 13, 2022

This week’s paper is from SIGCOMM 2021. I’m also trying out a new format for the paper reviews, and your thoughts are greatly appreciated. These paper reviews can be delivered weekly to your inbox, or you can subscribe to the Atom feed. As always, feel free to reach out on Twitter with feedback or suggestions!

What is the research?

The research in The Ties that un-Bind: Decoupling IP from web services and sockets for robust addressing agility at CDN-scale describes Cloudflare’s work to decouple networking concepts (hostnames and sockets) from IP addresses.

By decoupling hostnames and sockets from addresses, Cloudflare’s infrastructure can quickly change the machines that serve traffic for a given host, as well as the services running on each host - the authors call this approach addressing agility.

What are the paper’s motivations?

The paper notes reducing IP address use as the initial motivation for decoupling IP addresses from hostnames. The authors argue that CDNs don’t necessarily need large numbers of IP addresses to operate - this is in contrast with the fact that, “large CDNs have acquired a massive number of IP addresses: At the time of this writing, Cloudflare has 1.7M IPv4 addresses, Akamai has 12M, and Amazon AWS has over 51M!”

Traditionally, many CDNs use large numbers of IPs because their architecture (shown in the figure below) places entry and exit points on the public internet - entry points receive requests from clients, while exit points make requests to origin servers on a cache miss (these docs on “What is an origin server?” are helpful). For these machines to be reachable, they need public IP addresses.

Other factors can increase a CDN’s IP address usage. CDNs may bind specific IPs to hostnames, creating a relationship between the number of hostnames served by the CDN and the number of addresses the CDN requires. Furthermore, CDN servers normally have an upper bound on networking sockets (network connections have read/write buffers, as well as a kernel data structure called sk_buff), so increased client usage also translates into more machines (and associated IP addresses).

How does it work?

The paper focuses on two types of bindings: hostname-to-address bindings and address-to-socket bindings.

First, the paper describes how Cloudflare can quickly and dynamically update hostname-to-address bindings by changing configurations called policies - DNS servers ingest policies and use them to decide which IP addresses to return for a given hostname.

One example policy allows hostnames to map to IP addresses randomly chosen from a set of candidates (called a pool). Using policies instead of fixed mappings from hostnames to IP addresses is in contrast with other deployments, where changing hostname-to-IP address mappings is both operationally complex and error-prone.
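To make the random-pool policy concrete, here is a minimal sketch of a resolver that picks an answer from a pool. The policy names, hostnames, and the `resolve` function are hypothetical and chosen for illustration (the prefixes are reserved documentation ranges); this is not the paper's implementation.

```python
import ipaddress
import random

# Hypothetical policy table: each policy names a pool of candidate addresses.
# 192.0.2.0/24 and 198.51.100.0/24 are reserved documentation prefixes.
POLICIES = {
    "default": ipaddress.ip_network("192.0.2.0/24"),
    "small-pool": ipaddress.ip_network("198.51.100.0/28"),
}

# Hypothetical assignment of hostnames to policies.
HOSTNAME_POLICY = {
    "example.com": "default",
    "shop.example": "small-pool",
}


def resolve(hostname, rng=random):
    """Return one address chosen uniformly at random from the hostname's pool."""
    policy = HOSTNAME_POLICY.get(hostname, "default")
    pool = POLICIES[policy]
    # Indexing a network yields the address at that offset within the prefix.
    return pool[rng.randrange(pool.num_addresses)]


if __name__ == "__main__":
    for name in ("example.com", "shop.example"):
        print(name, "->", resolve(name))
```

Changing which addresses a hostname can resolve to is then a matter of editing the policy table, not of touching per-hostname records.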

The second major change to IP addressing decouples address-to-socket bindings.

Normally a service receives traffic on a fixed set of ports - this approach has several downsides. Each socket has overhead (meaning only a limited number of services can run on each machine), and it isn’t possible to run two services with overlapping ports on the same machine without complications (as they can’t re-use the same port). The paper notes one approach, using INADDR_ANY, that allows one socket to receive packets sent to all interfaces on a machine. This approach doesn’t come without its downsides, like potentially introducing security issues - if internal traffic goes to the same socket as external traffic, an internal service could accidentally respond to external requests.
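The port conflict is easy to reproduce with plain sockets. The sketch below (the function name is made up for illustration) binds one listener to INADDR_ANY and then tries to bind a second socket to the same port; it assumes default socket options, meaning neither SO_REUSEADDR nor SO_REUSEPORT is set.

```python
import errno
import socket


def second_bind_is_rejected():
    """Bind a listener to INADDR_ANY, then try to bind a second socket to the same port."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # "0.0.0.0" is INADDR_ANY; port 0 asks the kernel for any free port.
        first.bind(("0.0.0.0", 0))
        first.listen()
        port = first.getsockname()[1]
        try:
            second.bind(("0.0.0.0", port))
        except OSError as exc:
            # The expected failure is "address already in use".
            return exc.errno == errno.EADDRINUSE
        return False
    finally:
        first.close()
        second.close()


if __name__ == "__main__":
    print("second bind rejected:", second_bind_is_rejected())
```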

To address these challenges, Cloudflare’s system introduces programmable socket lookup (the Cloudflare blog has more background) using BPF; as part of the implementation, the authors built sk_lookup (there is also a great tutorial on it). eBPF/BPF have come up a few times in past paper reviews, and I really like Julia Evans’s post on the topic. Programmable socket lookup routes traffic inside of the kernel based on rules. An example rule could route client traffic to different instances of the same service running side-by-side, with a separate socket for every instance.
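The rule-based dispatch can be modeled in user space. The sketch below is not BPF and does not use the sk_lookup API; it is a plain Python illustration of first-match rule lookup, where the `Rule` fields, the rule table, and the socket names are all hypothetical.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    prefix: ipaddress.IPv4Network  # destination addresses the rule covers
    ports: range                   # destination ports the rule covers
    socket_name: str               # hypothetical label for the receiving socket


def lookup(rules, dst_ip, dst_port):
    """Return the socket label of the first matching rule, or None if nothing matches."""
    addr = ipaddress.ip_address(dst_ip)
    for rule in rules:
        if addr in rule.prefix and dst_port in rule.ports:
            return rule.socket_name
    return None


# Hypothetical rule table: one socket takes HTTPS for the whole prefix,
# another takes every other port on the same prefix.
RULES = [
    Rule(ipaddress.ip_network("192.0.2.0/24"), range(443, 444), "https-proxy"),
    Rule(ipaddress.ip_network("192.0.2.0/24"), range(1, 65536), "tcp-proxy"),
]

if __name__ == "__main__":
    print(lookup(RULES, "192.0.2.7", 443))
    print(lookup(RULES, "192.0.2.7", 8080))
    print(lookup(RULES, "203.0.113.1", 443))
```

Because the decision is made per packet from a rule table, a single socket can cover a whole prefix or port range without being bound to each address and port individually.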

Why does the research matter?

The paper discusses a number of performance and security benefits that addressing agility provides - importantly, these benefits are available with no discernible change to other important system metrics!

First, decoupling hostname-to-address and address-to-socket bindings allows the Cloudflare CDN to operate with fewer IPs (the paper notes that the approach is transferable to external deployments as well, with a few caveats). Addresses no longer need to be reserved for use by a specific hostname, and machines can now have significantly more sockets. Using fewer IP addresses reduces cost and lowers the barrier to entry - the paper notes that the IP space owned by the major cloud providers is worth north of 500 million USD.

The IP addresses that Cloudflare does continue to use also become easier to manage. Dynamically allocating IP addresses to hostnames turns the operational task of taking machines (and the associated addresses) offline into a matter of removing addresses from the pool provided to clients.
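As a rough sketch of that operational step, the hypothetical `AddressPool` class below stops handing out an address once it is withdrawn. It assumes answers are drawn at random from the active set, and it does not model how long clients keep using answers they have already cached.

```python
import random


class AddressPool:
    """Hypothetical pool of addresses that DNS answers are drawn from."""

    def __init__(self, addresses):
        self._active = list(addresses)

    def withdraw(self, address):
        """Stop handing out an address (raises ValueError if it is not in the pool)."""
        self._active.remove(address)

    def pick(self, rng=random):
        """Return one active address chosen uniformly at random."""
        return rng.choice(self._active)


if __name__ == "__main__":
    pool = AddressPool(["192.0.2.1", "192.0.2.2", "192.0.2.3"])
    pool.withdraw("192.0.2.2")  # take this address out of rotation
    print(pool.pick())
```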

Furthermore, the randomization approach described by the paper (where IP addresses from a pool are returned in response to DNS queries) results in better load balancing.

While the paper discusses the scalability benefits addressing agility provides, it also discusses other implications beyond limiting address use - as an example, the approach can help with Denial of Service attacks.

If a specific address is under attack, the traffic to that address can be blackholed (see Cloudflare’s reference on blackhole routing). If a hostname is under attack, the traffic to that hostname will be distributed evenly across machines in the address pool.
