Title: Speeding up Machine Learning using Graphs and Codes
Abstract: I will talk about three simple combinatorial ideas to speed up parallel and distributed learning algorithms. We will start off with serial equivalence in asynchronous parallel machine learning, its significance, and how we can efficiently achieve it using a recent result on graph phase transitions. We will continue on the issue of stragglers (i.e., slow nodes) in distributed systems; here, we will use erasure codes to robustify gradient based algorithms against delays. In our last example, we will reduce the high communication cost of data-shuffling in distributed learning, using the seemingly unrelated notion of coded caching. For all the above, we will see real world experiments that indicate how these simple ideas can significantly improve the speed and accuracy of large-scale learning.
Speaker's bio: I am an assistant professor of Electrical and Computer Engineering at the University of Wisconsin-Madison, and a faculty affiliate with the Optimization group at the Wisconsin Institute for Discovery. My research lies in the intersection of machine learning, coding theory, and distributed systems. I am particularly interested in coordination-avoiding algorithms for large-scale machine learning, and on using erasure codes to speed up distributed computation.
Before coming to Madison, I spent two wonderful years as a postdoc at UC Berkeley. At Berkeley, I was a member of the AMPLab and BLISS, and had the pleasure to collaborate with Kannan Ramchandran and Ben Recht, on several fun projects. I received my Ph.D. in 2014 from UT Austin, where I was fortunate to be advised by Alex Dimakis. Before UT, I spent 3.5 years as a grad student at USC. Before all that, I received my M.Sc. (2009) and ECE Diploma (2007) from the Technical University of Crete (TUC), located in the beautiful city of Chania.