Speaker: Jun Li
Lecture in Data Science: Straggler-free Erasure Coding for Distributed Matrix Multiplication
Lecture Date: Fri, Jun 26
Lecture Time: 1:00 - 2:00 PM
Abstract: Large-scale distributed systems are built on top of a large number of commodity nodes and run multiple tasks in parallel on different nodes. They suffer from unreliable performance because stragglers are the rule rather than the exception. Stragglers can be tolerated by replicating tasks on multiple nodes, or by detecting stragglers and relaunching the affected tasks, but these approaches either consume a significant amount of resources or compromise the completion time. In my research, I use erasure coding to create coded tasks that tolerate stragglers efficiently, with much lower overhead. In this talk, I will first give a brief overview of my research, and then introduce my recent work on erasure coding for distributed matrix multiplication, a common operation in machine learning workloads. In particular, I will present coding schemes we have designed for single and concurrent matrix multiplications that take both computation and communication into account. Finally, I will conclude by discussing my future work, which integrates the design of erasure coding with machine learning.
Biosketch: Jun Li received his Ph.D. degree from the Department of Electrical and Computer Engineering, University of Toronto, in 2017, and his B.S. and M.S. degrees from the School of Computer Science, Fudan University, China, in 2009 and 2012, respectively. Since 2017, he has been a tenure-track assistant professor in the School of Computing and Information Sciences, Florida International University. Bridging the gap between theory and practice, his research studies both theoretical and practical challenges of deploying erasure coding in distributed systems for storage, analytics, and machine learning.