Kendrick Boyd: Precision-Recall Space and Empirical Algorithm Evaluation

Room: Biotechnology Center Auditorium, 425 Henry Mall
Speaker Name: Kendrick Boyd
Speaker Institution: Department of Computer Sciences, UW-Madison
Cookies: No

 

Abstract:
ROC curves are widely used to represent the quality of medical screening procedures such as mammography. We show that when screening for low-prevalence diseases, as in mammography for breast cancer, precision-recall (PR) curves have significant advantages over ROC curves. Because of these advantages, PR curves, and the areas under them, are already the evaluation metrics of choice for other tasks characterized by low prevalence, such as information retrieval.

While PR curves are frequently used as a simple replacement for ROC curves, they have subtleties that must be considered. It is already known that PR curves change as class skew changes. What has not previously been recognized is that a region of PR space is completely unachievable, and that the size of this region depends only on the skew. We precisely characterize the size of the unachievable region and discuss its implications for empirical evaluation methodology in machine learning.
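
To make the unachievable region concrete, the following is a minimal Python sketch. It is not taken from the talk. It assumes that the worst possible classifier ranks every negative example ahead of every positive one, so that at any recall all negatives are already counted as false positives. The function names (min_precision, unachievable_area, unachievable_area_numeric) and the parameter name skew (the fraction of positive examples) are illustrative choices, not the speaker's notation.

```python
import math


def min_precision(recall: float, skew: float) -> float:
    """Lowest precision reachable at a given recall, under the assumed
    worst-case ranking (all negatives ranked above all positives).

    skew is the fraction of examples that are positive, 0 < skew < 1.
    """
    if not 0.0 < skew < 1.0:
        raise ValueError("skew must be strictly between 0 and 1")
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    # Per-example counts in the worst case:
    # true positives = skew * recall, false positives = every negative.
    tp = skew * recall
    fp = 1.0 - skew
    return tp / (tp + fp)


def unachievable_area(skew: float) -> float:
    """Area under the min_precision curve for recall in [0, 1],
    obtained by integrating min_precision in closed form."""
    if not 0.0 < skew < 1.0:
        raise ValueError("skew must be strictly between 0 and 1")
    return 1.0 + (1.0 - skew) * math.log(1.0 - skew) / skew


def unachievable_area_numeric(skew: float, steps: int = 100_000) -> float:
    """Midpoint-rule check of the closed form above."""
    h = 1.0 / steps
    return h * sum(min_precision((i + 0.5) * h, skew) for i in range(steps))


if __name__ == "__main__":
    for skew in (0.01, 0.1, 0.5):
        print(skew, unachievable_area(skew), unachievable_area_numeric(skew))
```

Under this assumption, the minimum precision at full recall equals the skew itself, and the area below the curve grows as the positive class becomes more common, which is one way to see why PR results from data sets with different skews are hard to compare directly.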

Event Date:
Tuesday, March 27, 2012 - 4:00pm - 5:00pm