In recent years, the need for large-scale real-time analytics has exploded, and a plethora of streaming systems have been developed to support such applications. These systems are able to continue stream processing even when faced with hardware and software failures. However, they do not address some crucial challenges facing their operators: the manual, time-consuming, and error-prone task of tuning various configuration knobs to achieve service level objectives (SLOs), and the maintenance of SLOs in the face of sudden, unpredictable load variation and hardware or software performance degradation. In this talk, we introduce the notion of self-regulating streaming systems and the key properties that they must satisfy. We then present the design and evaluation of Dhalion, a system that provides self-regulation capabilities to underlying streaming systems. We describe our implementation of the Dhalion framework on top of Twitter Heron, as well as a number of policies that automatically reconfigure Heron topologies to meet throughput SLOs, scaling resource consumption up and down as needed. We are in the process of open-sourcing Dhalion as part of the Heron project.
Speaker bio: Avrilia Floratou is a Senior Scientist at Microsoft's Cloud and Information Services Lab (CISL). Her research interests lie broadly in the area of data management, with a recent focus on large-scale stream processing and real-time analytics. She is also an active contributor to Heron, collaborating with Twitter. Prior to her current role, she was a research scientist at the IBM Almaden Research Center, working on SQL-on-Hadoop engines and incorporating her research into the IBM Big SQL product offering. She received her Ph.D. and M.Sc. in Computer Science from the University of Wisconsin-Madison and her B.S. from the University of Athens in Greece.