Alekh Agarwal (WID Optimization Faculty Candidate): Statistics meets Computation: Exploiting structure for efficient learning with large datasets
The past decade has seen the emergence of datasets of an unprecedented scale, with both large sample sizes and high dimensionality. Massive datasets arise in many domains, among them computer vision, natural language processing, computational biology, social network analysis, and recommendation systems. In many such problems, the bottleneck is not just the number of data samples but also the computational resources available to process the data. An important goal in these problems is to understand how we might leverage the statistical structure of the problem to obtain efficient computational procedures.
In this talk, I present three research threads that provide different lines of attack on this broader research agenda: (i) distributed algorithms for statistical inference; (ii) the interplay between statistical and computational complexities in structured high-dimensional estimation; and (iii) some computational trade-offs in noisy matrix decomposition.
In the first two threads, we will demonstrate how exploiting the statistical structure of a problem lets us beat its worst-case computational complexity.
The last part will show an example where the computational cost of the solution grows with the statistical demands we place on the algorithm.
[Joint work with John Duchi, Sahand Negahban, Peter Bartlett, and Martin Wainwright.]
