Ethan Katz-Bassett, Matt Calder, and Team Win Best Paper Award at SIGCOMM 2021

October 29, 2021

ACM SIGCOMM is the flagship annual conference of the ACM Special Interest Group on Data Communication (SIGCOMM). The paper, “Seven Years in the Life of Hypergiants’ Off-Nets,” uncovers and maps the expansion of content delivery networks (CDNs), which are groups of geographically distributed servers that speed up the delivery of web content by bringing it closer to users.

On the team are:

The work is important because it helps researchers understand trends shaping the Internet, which is often opaque because its administration is distributed across tens or hundreds of thousands of networks.

The changes uncovered by the paper also have implications for economists trying to model the Internet, since they affect the relationships between ISPs and content providers, and for regulators weighing policies such as net neutrality, which often treat local traffic differently.

Two trends that have transformed Internet traffic are:

  • Traffic has consolidated greatly. In 2007, it took thousands of networks to account for half of the Internet's traffic. By 2009, it took only 150. Today, it takes five. These and the other large networks are the so-called "hypergiants" (Microsoft, Akamai, Google, Facebook, Netflix, Amazon, Cloudflare, Alibaba, etc.).
     
  • With the rise of streaming video and similar rich applications, the volume of traffic has grown enormously.

The hypergiants use various strategies to optimize their deployments to deliver this content. They deploy servers on their own networks, the so-called "on-nets." Some of them also deploy servers inside ISPs to serve those ISPs' customers locally. These are the so-called "off-nets," because they sit off the hypergiant's own network.

These off-nets are hard to detect using traditional measurement techniques for two reasons. First, they use IP addresses from the ISP rather than from the hypergiant, so researchers do not know where to look. Second, they serve only local customers, so finding them by being served by them would require vantage points everywhere (in tens or hundreds of thousands of networks around the world).

The team developed a technique that lets them uncover essentially all off-nets for all hypergiants. It was enabled by the fact that more and more Internet traffic is encrypted using HTTPS. HTTPS works by having the server provide an unforgeable certificate saying that it is allowed to serve a given site (for example, youtube.com). The team's technique scans all IPv4 addresses to check whether they present certificates for the hypergiants. The group also applies fingerprints and filters to weed out other situations, such as Netflix deploying some of its services on Amazon's cloud (which the group would not consider part of the Netflix off-net deployment, because they are looking for servers that Netflix operates, not ones operated by Amazon).
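To make the idea concrete, here is a minimal Python sketch of how one record from such a scan might be classified. It is an illustration under stated assumptions, not the team's actual pipeline: the record layout, the domain and AS-number lists, and the header "fingerprints" are all hypothetical placeholders chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Optional, Tuple

# Illustrative inputs only: a real study would need curated, complete lists.
HYPERGIANT_DOMAINS = {
    "netflix": ("netflix.com", "nflxvideo.net"),
    "google": ("google.com", "youtube.com"),
}
HYPERGIANT_ASNS = {"netflix": {2906}, "google": {15169}}
# Hypothetical fingerprints: a response header value assumed to identify
# servers the hypergiant itself operates. These values are made up.
HYPOTHETICAL_FINGERPRINTS = {
    "netflix": ("Server", "example-netflix-cache"),
    "google": ("Server", "example-google-cache"),
}


@dataclass
class ScanRecord:
    """One scanned IPv4 address (hypothetical record layout)."""
    ip: str
    origin_asn: int                 # AS that announces the IP address
    cert_dns_names: list            # DNS names found in the TLS certificate
    headers: dict = field(default_factory=dict)


def name_matches(cert_name: str, domain: str) -> bool:
    """True if a certificate name (possibly a wildcard) falls under domain."""
    name = cert_name.lower()
    if name.startswith("*."):
        name = name[2:]
    return name == domain or name.endswith("." + domain)


def classify(record: ScanRecord) -> Optional[Tuple[str, str]]:
    """Return (hypergiant, label) or None if no hypergiant certificate."""
    for hg, domains in HYPERGIANT_DOMAINS.items():
        if not any(name_matches(n, d)
                   for n in record.cert_dns_names for d in domains):
            continue
        if record.origin_asn in HYPERGIANT_ASNS[hg]:
            return (hg, "on-net")    # IP is inside the hypergiant's network
        header, value = HYPOTHETICAL_FINGERPRINTS[hg]
        if record.headers.get(header) == value:
            return (hg, "off-net")   # hypergiant-run server in another network
        return (hg, "filtered")      # e.g. a service hosted on someone's cloud
    return None
```

For example, a record in another network's address space whose certificate covers `*.nflxvideo.net` and whose headers match the assumed fingerprint would be labeled a Netflix off-net, while the same certificate on a cloud provider's server without the fingerprint would be filtered out.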

Crucially, these types of scans are available going back to 2013, even though they weren't collected for this purpose. The paper captures the changing deployments over this period of explosive hypergiant growth, including the births of content-serving infrastructure from Facebook, Netflix, and Alibaba. As a result of this growth, the large hypergiants can now serve large fractions of the world’s Internet users directly from within the users’ networks. Recently, growth has been fast in Europe, Asia, and, especially, Latin America. This study opens interesting research directions on Internet privatization, content delivery, and security practices.


Ethan Katz-Bassett completed his PhD in 2012 at the University of Washington, advised by Tom Anderson and Arvind Krishnamurthy, and then worked at Google as part of a team tasked with making the mobile web fast. In 2017 he joined Columbia University and is currently an Associate Professor in the Electrical Engineering department. His research goal is to make Internet services reliable and fast. 

Matt Calder finished his PhD at the University of Southern California in 2019 and joined the Data Science Institute in the summer of 2020 as a research scientist. He currently works on research related to Internet measurement and topology, traffic engineering, and capacity planning. He is also an Applied Science Lead at Microsoft, where he works on network capacity planning for Microsoft’s global network.