March 10, 2015
Speaker: Kristen Gardner, Department of Computer Science, Carnegie Mellon University
Recent computer systems research has proposed using redundant requests to reduce latency. The idea is to run a single request on multiple servers and wait only for the first completion (discarding all remaining copies of the request). However, no exact analysis of systems with redundancy has existed. We present the first exact analysis of such systems, allowing any number of classes of redundant requests, any number of classes of non-redundant requests, any degree of redundancy, and any number of heterogeneous servers. In all cases we derive the limiting distribution on the state of the system.
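The core intuition behind redundancy can be seen in a tiny Monte Carlo sketch (not from the talk; a simplified illustration that ignores queueing, which the full analysis does handle): if a request's copies run on independent servers with exponential service rates, its effective service time is the minimum of the copies' times. The rates below are hypothetical.

```python
import random

def response_time(rates, rng):
    """Service time of one redundant request: a copy runs on each
    server (exponential service rates), and we keep the first to finish."""
    return min(rng.expovariate(mu) for mu in rates)

def mean_response(rates, n=100_000, seed=0):
    """Monte Carlo estimate of the mean service time."""
    rng = random.Random(seed)
    return sum(response_time(rates, rng) for _ in range(n)) / n

# Hypothetical two-server example with rates mu1 = mu2 = 1:
# the minimum of two independent Exp(1) times is Exp(2), so the
# mean drops from 1.0 (one copy) to 0.5 (two redundant copies).
single = mean_response([1.0])
redundant = mean_response([1.0, 1.0])
```

Here the redundant request finishes, on average, in half the time of a single copy; the talk's analysis quantifies how this "gain" interacts with queueing and with non-redundant traffic.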
In small (two- or three-server) systems, we derive simple forms for the response time distribution of both the redundant and non-redundant classes, and we quantify the "gain" to redundant classes and the "pain" to non-redundant classes caused by redundancy, finding some surprising results. We also compare redundancy with other approaches for reducing latency, such as optimal probabilistic splitting of a class among servers (Opt-Split) and Join-the-Shortest-Queue (JSQ) routing of a class. We find that redundancy outperforms both JSQ and Opt-Split with respect to overall response time, making it an attractive solution. Finally, we investigate performance in scaled systems and find that many of our counterintuitive results continue to hold as the number of servers increases.
To appear in: ACM SIGMETRICS 2015
Kristen Gardner is a Ph.D. student in the Computer Science Department at Carnegie Mellon University, working with Mor Harchol-Balter. She is interested in scheduling and queueing theory, particularly for systems in which different classes of jobs receive different levels of service. She received her B.A. from Amherst College in 2012.
Hosted by Ton Dieker.