Machine learning models trained on massive datasets power a number of applications, from machine translation to detecting supernovae in astrophysics. However, the end of Moore’s law and the shift towards distributed computing architectures present many new challenges for building and executing such applications in a scalable fashion.
In this talk, I will present my research on systems that make it easier to develop new machine learning applications and scale them while achieving high performance. I will first present programming models that let users easily build distributed machine learning applications. Next, I will show how we can exploit the structure of machine learning workloads to build low-overhead performance models that help users understand scalability and simplify large-scale deployments. Finally, I will describe scheduling techniques that can improve scalability and achieve high performance when using distributed data processing frameworks.
Bio: Shivaram Venkataraman is a PhD candidate at the University of California, Berkeley, and is advised by Mike Franklin and Ion Stoica. His research interests are in designing systems and algorithms for large-scale data processing and machine learning. He is a recipient of the Siebel Scholarship and best-of-conference citations at VLDB and KDD. Before coming to Berkeley, he completed his M.S. at the University of Illinois, Urbana-Champaign.