
Faculty Candidate Talk: A Theory of Similarity Functions for Learning and Clustering

Maria-Florina (Nina) Balcan
Carnegie Mellon University
Monday, February 25, 2008
4:00 p.m., 1221 CS (Cookies: 3:30 p.m., 2310 CS)

Abstract:

One of the most powerful tools developed in machine learning in recent years is the class of kernel methods. These methods perform well in many applications, and there is also a well-developed theory of when a given kernel is useful for a given learning problem. However, while a kernel can be thought of as just a pairwise similarity function that satisfies additional mathematical properties, the existing theory requires viewing kernels as implicit (and often difficult to characterize) maps into high-dimensional spaces. In this work we develop an alternative theory of learning with more general similarity functions, which requires neither reference to implicit spaces, nor the function to be positive semi-definite. Our results strictly generalize the standard theory, and any good kernel function under the usual definition can be shown to also be a good similarity function under our definition.

We then show how our framework can also be applied to clustering: multi-way classification from purely unlabeled data. In particular, using this perspective we develop a new model that directly addresses the fundamental question of what kind of information a clustering algorithm needs in order to produce a highly accurate partition of the data. Our work can be viewed as an approach to defining a discriminative model for clustering with non-interactive feedback.

Speaker's bio:

Maria-Florina Balcan is a Ph.D. candidate at Carnegie Mellon University under the supervision of Avrim Blum. She received B.S. and M.S. degrees from the Faculty of Mathematics, University of Bucharest, Romania. Her main research interests are Computational and Statistical Machine Learning, Computational Aspects in Economics and Game Theory, and Algorithms. She is a recipient of the IBM PhD Fellowship.