Marc Snir: Resilience at Exascale
It is often feared that the growing frequency of hardware errors will be a major obstacle to the deployment of exascale systems. The Department of Energy held several workshops to study this issue, including a week-long workshop organized by the Institute for Computing Sciences. This workshop brought together leading researchers in circuits, architecture, operating systems, and applications. It produced a report that outlines possible scenarios for handling resilience at exascale and the research required to make progress in this area. Snir will discuss this report, the questions it raises, and the research directions it identifies.
Marc Snir is Director of the Mathematics and Computer Science Division at Argonne National Laboratory and the Michael Faiman and Saburo Muroga Professor in the Department of Computer Science at the University of Illinois at Urbana-Champaign. He currently pursues research in parallel computing.
He was head of the Computer Science Department at Illinois from 2001 to 2007. Until 2001 he was a senior manager at the IBM T. J. Watson Research Center, where he led the Scalable Parallel Systems research group, which was responsible for major contributions to the IBM SP scalable parallel system and to the IBM Blue Gene system.
Marc Snir received a Ph.D. in Mathematics from the Hebrew University of Jerusalem in 1979, worked at NYU on the NYU Ultracomputer project from 1980 to 1982, and was at the Hebrew University of Jerusalem from 1982 to 1986 before joining IBM. He was a major contributor to the design of the Message Passing Interface. He has published numerous papers and given many presentations on computational complexity, parallel algorithms, parallel architectures, interconnection networks, parallel languages and libraries, and parallel programming environments.
Marc is an Argonne Distinguished Fellow, AAAS Fellow, ACM Fellow, and IEEE Fellow. He has Erdős number 2 and is a mathematical descendant of Jacques Salomon Hadamard.
Hosted by the Center for High Throughput Computing
