My research focuses on the theoretical aspects of modern data management systems, and in particular those related to the processing of big data, which is typically huge, noisy and valuable. In this talk, I will focus on two fundamental problems: processing big data with massive parallelism, and pricing data.
In the first part, I will look into modern data analytics systems that use large-scale parallelism to analyze massive datasets, and present a theoretical model that captures the complexity of query processing in such systems. The main parameter is the communication cost, which becomes a dominant factor because of the need to reshuffle the data at each round of the computation. Based on this model, I will discuss how we can design efficient parallel algorithms for query processing with formal guarantees on their behavior.
In the second part, I will show how the emergence of data as a valuable commodity that can be bought and sold motivates the problem of pricing data. I will then discuss the properties that the assigned prices must satisfy, and address the challenges of building a practical pricing system with such formal guarantees.
Bio: Paris Koutris is a Ph.D. candidate in the Computer Science & Engineering Department at the University of Washington, advised by Dan Suciu. His research focuses on the theoretical aspects of data management, and in particular on problems that arise in modern applications and systems for processing big data. He received his Diploma from the National Technical University of Athens and also completed an M.Sc. degree in Logic, Algorithms and Computation at the University of Athens.