Improving Network Performance Through Endpoint Diagnosis

Date: 2:00pm, March 3, 2017
Location:  MUDD 327
Speaker:  Behnaz Arzani, PhD Candidate, University of Pennsylvania

Abstract:  Components within a data center and the Internet can fail. When failures occur, users of the network (clients) do not have access to the various components of this complex distributed system to determine the cause of failures. In this talk, I present my work on endpoint diagnosis, where the aim is to provide tools to help clients find the cause of these failures. The proposed solution is based on a two-step approach where the endpoint identifies the entity responsible for failures without requiring any support from the network or the remote end hosts. If the network is determined to be the cause of the failure, a second step is triggered to identify the device responsible for the failure. In order to achieve this goal, we tackle the research challenge of inferring the cause of data center failures using only TCP statistics collected at one of the endpoints. To validate this approach, we have developed two monitoring tools, NetPoirot and Vigil. NetPoirot detects the right entity (storage, compute, or network) to assist in the failure resolution process. Vigil further closes the network diagnostics gap by pinpointing the specific network entity that causes the failure. Our results on a large production data center show that NetPoirot and Vigil can effectively identify faults in a data center with low overhead while providing accuracies as high as 90%.

Biography:  Behnaz Arzani is a PhD candidate at the Computer and Information Science department at the University of Pennsylvania. She received her undergraduate degree in Engineering from the Sharif University of Technology. Her research interests are broadly in the design and implementation of practical networks with strong theoretical and mathematical foundations. Her current research focuses on failure resilience and diagnosis in data center networks, and multipath routing for video delivery. During her doctoral studies, she has collaborated with NEC Research and Microsoft Research and the Azure team, resulting in tools deployed in production data centers and patents.

500 W. 120th St., Mudd 1310, New York, NY 10027    212-854-3105
©2014 Columbia University