PL Technology for Data Analysts

Tuesday, September 8, 2015 - 1:00pm
CS 2310

Speaker Name: Dan Barowy

Speaker Institution: University of Massachusetts Amherst




The rapid growth in demand for data scientists and analysts suggests that data-driven decision making is increasingly valued. Nonetheless, in practice, these workers spend between 50% and 80% of their time on "data wrangling" tasks: transforming data and removing errors. We can expect this problem to become even more severe as the volume of data grows.

I will present two spreadsheet tools that are designed to reduce the data wrangling overhead. The first tool, FlashRelate (PLDI 2015), transforms information stored in ad-hoc spreadsheet layouts into relational tables. Users provide FlashRelate with examples of the transformed output, and the engine synthesizes the appropriate transformation program. The second tool, CheckCell (OOPSLA 2014), identifies the set of inputs most likely to cause anomalous program behavior. Flagged inputs are either errors or very important values; in both cases they warrant careful inspection. Both tools are designed for non-programmers and are driven via point-and-click interfaces. FlashRelate won the PLDI 2015 Distinguished Artifact Award and CheckCell won the 2013 MSR Software Engineering Innovation Foundation Award.

About the speaker:

Daniel Barowy is a sixth-year graduate student in the College of Information and Computer Sciences at the University of Massachusetts Amherst, currently working with Emery Berger in the PLASMA Lab. His interests lie in the design and implementation of programming languages, particularly for novice users. His work focuses on program reliability, tools for end-users, and debuggers. His work on AutoMan, a domain-specific language for reliable crowdsourced computation (OOPSLA '12), was recently selected as a CACM Research Highlight (to appear in Jan '16), and his work on FlashRelate (PLDI '15) won the PLDI 2015 Distinguished Artifact Award. Dan has interned at IBM T.J. Watson and at Microsoft Research (New York City, Silicon Valley, and Redmond). He received his B.S. in Computer Science from Boston University and his B.A. in Philosophy and Legal Studies from the University of Massachusetts Amherst.