Ilias Diakonikolas and Christos Tzamos, two UW-Madison Computer Sciences professors, have earned the Outstanding Paper Award at NeurIPS 2019, the Conference on Neural Information Processing Systems. NeurIPS is one of the most prestigious academic conferences on artificial intelligence (AI). This year the conference received 6,743 submissions, of which 1,428 were accepted.
The paper is titled “Distribution-Independent PAC Learning of Halfspaces with Massart Noise” and is coauthored by Diakonikolas, Tzamos, and Themis Gouleakis (currently at the Max Planck Institute, formerly a postdoctoral researcher working with Diakonikolas). The NeurIPS Outstanding Paper Committee looks for papers that are revolutionary and will change the way people think about the topic, surprise the reader with their creativity, endure far into the future, and demonstrate both elegance and rigor.
Fitting a model to a training dataset is the prototypical goal in machine learning, according to Diakonikolas. A ubiquitous example is linear classification, where a decision is based on the value of a linear combination of the features. Linear classification is well understood when the training data is clean, and several efficient methods are known for it. If, however, even a small fraction of the training labels is noisy, standard data-fitting methods inherently break down. This paper gives a new efficient method for linear classification that is robust to a challenging model of random noise, in which each label may be corrupted with some probability that is bounded but otherwise unknown.
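To make this noise model concrete, here is a minimal Python sketch of how such a corrupted training set can arise. It is illustrative only: the separator w_true, the bound eta, and the flip-probability rule are hypothetical choices for this sketch, not anything taken from the paper.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical ground-truth linear separator (unknown to the learner).
    w_true = np.array([1.0, -2.0, 0.5])

    # Upper bound on the per-example corruption probability.
    eta = 0.3

    # Feature vectors drawn from an arbitrary distribution; the paper's
    # method makes no assumptions about this distribution.
    X = rng.normal(size=(1000, 3))

    # Clean labels: the sign of a linear combination of the features.
    y_clean = np.sign(X @ w_true)

    # Point-dependent flip probabilities, each bounded by eta but
    # otherwise arbitrary and unknown to the learner.
    flip_prob = eta * np.abs(np.tanh(X[:, 0]))

    # Each label is flipped independently with its own probability.
    flips = rng.random(len(X)) < flip_prob
    y_noisy = np.where(flips, -y_clean, y_clean)

    print(f"fraction of corrupted labels: {flips.mean():.3f}")

Fitting an ordinary linear classifier directly to (X, y_noisy) is precisely the situation in which standard data-fitting methods can be misled, and it is the setting the paper's algorithm is designed to handle.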
In more technical terms, the work gives the first efficient algorithm for learning linear separators (halfspaces) in the presence of Massart noise. “The distinguishing feature of the algorithm is that it succeeds without any assumptions on the data distribution,” says Diakonikolas. This work made the first algorithmic progress on a question that researchers have posed repeatedly since the 1980s.
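For readers who want the formal setup, the Massart noise model is typically stated as follows (this is the standard formulation of the model, not a quotation from the paper): each example x is drawn from an arbitrary, unknown distribution over R^d, its clean label is the sign of its inner product with an unknown target vector w*, and each label is independently flipped with a point-dependent probability η(x) that never exceeds a bound η < 1/2. In LaTeX notation:

    \Pr[\, y \neq \mathrm{sign}(w^{*} \cdot x) \mid x \,] = \eta(x) \le \eta < \tfrac{1}{2}

Roughly speaking, the paper's algorithm runs in polynomial time and outputs a halfspace whose misclassification error is at most η + ε, with no assumptions on the distribution of x.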
When the paper was submitted, Diakonikolas was a faculty member at the University of Southern California, and he performed part of the work that made the paper possible while at the Simons Institute for the Theory of Computing at the University of California, Berkeley.