Many tasks for understanding and managing the execution of systems, such as debugging, snapshotting, monitoring, accounting, and providing performance guarantees, are much harder in distributed settings. Correspondingly, many techniques, such as distributed timestamps, end-to-end tracing, and taint tracking, have been successfully used to help with these tasks. Their deployment, however, is usually fraught with difficulties, including intrusive instrumentation and lack of pervasiveness. In this talk I will describe some of the recent successes we've had with these mechanisms, including Pivot Tracing (SOSP 2015) and Retro (NSDI 2015), and will outline a vision for a tracing plane: a layered architecture that factors out the primitives common to all these techniques (most importantly, the causal propagation of generic metadata), with the goal of simplifying the instrumentation of current and new systems and lowering the barrier to adoption of these and novel techniques.
Rodrigo Fonseca is an assistant professor at Brown University's Computer Science Department. He holds a PhD from UC Berkeley and, prior to Brown, was a visiting researcher at Yahoo! Research. He is broadly interested in networking, distributed systems, and operating systems, and is the recipient of an NSF CAREER award and a 2015 SOSP Best Paper Award. His research involves seeking better ways to build, operate, and diagnose distributed systems, including large-scale internet systems, cloud computing, and mobile computing. He is currently working on dynamic tracing infrastructures for these systems, on new ways to leverage network programmability, and on better ways to manage energy usage in mobile devices.