Computer Sciences Weekly Good News

 

Welcome to CS Good News, gathered by our gallant former department chair Remzi Arpaci-Dusseau! We’ll post these weekly – more or less. Updates are provided by CS faculty and do not cover all the good news we have in the department – that’s probably not possible. But here are some highlights. Enjoy!

This is an accordion element with a series of buttons that open and close related content panels.

April 22, 2024

  • Once again, Prof Dieter van Melkebeek has led the University of Wisconsin to great heights in the 46th ICPC finals, including an incredible 8th place finish overall (2nd highest US school, behind only MIT). [1]
  • Prof Mike Swift was named a Vilas Distinguished Achievement Professor. [2]
  • Prof Rahul Chatterjee was awarded a Reilly-Baldwin Wisconsin Idea Endowment Grant for his proposal “Supporting Survivors of Technology-Facilitated Intimate Partner Violence in Rural Wisconsin”. [3]
  • Finally, let us once again congratulate our Chair Steve Wright for being awarded the prestigious Hilldale Professorship. [4]  

Notes:

[1] https://icpc.global/

The International Collegiate Programming Contest is an algorithmic programming contest for college students. Teams of three, representing their university, work to solve the most real-world problems, fostering collaboration, creativity, innovation, and the ability to perform under pressure. Through training and competition, teams challenge each other to raise the bar on the possible. Quite simply, it is the oldest, largest, and most prestigious programming contest in the world.

Scoreboard: https://scoreboard.icpc.global/46/index.html

[2] https://news.wisc.edu/43-faculty-honored-with-vilas-professorships-and-awards/

Sixteen professors were named to Vilas Distinguished Achievement Professorships, an award recognizing distinguished scholarship as well as standout efforts in teaching and service. The professorship provides five years of flexible funding — two-thirds of which is provided by the Office of the Provost through the generosity of the Vilas trustees and one-third provided by the school or college whose dean nominated the winner.

[3] https://provost.wisc.edu/baldwin-wisconsin-idea-endowment/

“Supporting Survivors of Technology-Facilitated Intimate Partner Violence in Rural Wisconsin”

PI Rahul Chatterjee

Survivors of intimate partner violence (IPV) are regularly controlled and harassed using technological means, like stalking using malicious phone applications, online accounts, and even some tools like AirTags. Supporting survivors of IPV have been predominantly concentrated on urban settings, limited information exists regarding the unique challenges faced by rural survivors. Through an integrated research-practice collaboration, this proposal seeks to explore the distinctive issues of TIPV in rural Wisconsin and collaboratively design a remote tech clinic program with key stakeholders to support survivors in remote areas. The goal of this project is to establish a remote tech clinic program in collaboration with the strong victim service provider network already present in WI to support survivors living anywhere in Wisconsin. This initiative aims to establish a replicable framework and share open-source tools for adoption in other states. Ultimately, the project aims to improve resource accessibility, bridging the rural-urban resource disparity for tech-facilitated IPV survivors.

[4] https://news.wisc.edu/evjue-chair-hilldale-professors-named/

Hilldale Professorships are given to faculty members who excel in scholarly activity, have records of outstanding research or creative work, and show promise of continued productivity. Recipients receive a salary increase in addition to funding that may be used for research support and teaching release. Appointments are for five years with the possibility of renewal until the individual leaves the university or retires.

April 1, 2024

  • Prof Loris D’Antoni’s student, Shaan Nagy, won an NSF Graduate Student Research Fellowship. [1] 
  • Prof Josiah Hanna published “Scaling Offline Evaluation of Reinforcement Learning Agents through Abstraction” in AAAI. [2] 
  • Prof Christos Tzamos published “Contextual Pandora’s Box” in AAAI. [3] 
  • Prof Jerry Zhu published “Exact Policy Recovery in Offline RL with Both Heavy-Tailed Rewards and Data Corruption” in AAAI. [4] 
  • Prof Jerry Zhu published “Optimal Attack and Defense for Reinforcement Learning” in AAAI. [5] 
  • Prof Jerry Zhu published “Data Poisoning to Fake a Nash Equilibria for Markov Games” in AAAI. [6] 

Notes:

[1] https://pages.cs.wisc.edu/~sanagy/

“I am a second-year Computer Science PhD student working under Loris D’Antoni. My recent work has focused on semantics and verification over infinite sets of programs.

I received my BSCS in Computer Science and BS in Math from Rice University in May 2022. While there, I worked with Moshe Vardi on computing partition functions of Ising models.”

https://www.nsfgrfp.org/

[2] https://ojs.aaai.org/index.php/AAAI/article/view/30283

A critical challenge for the widescale adoption of reinforcement learning (RL) is the need to give domain experts assurance that learned policies will improve decision-making — and not lead to unacceptable behavior. To meet this challenge, my work aims to develop new methods for offline policy evaluation in real world RL domains. There has been much recent interest in offline evaluation and many advances. However, recent benchmarking efforts have also shown that there remains a substantial gap between current state-of-the-art methods and real world domains such as robotics. Towards scalable offline evaluation, my group is investigating the use of methods for abstraction and representation learning. In this new faculty highlight, I will present our recent results that show the promise of this direction for scaling offline evaluation in RL domains. I will then describe future directions in this line of that work which will further realize the promise of offline policy evaluation for increasing confidence in deployed RL.

[3] https://ojs.aaai.org/index.php/AAAI/article/view/28969/

Pandora’s Box is a fundamental stochastic optimization problem, where the decision-maker must find a good alternative, while minimizing the search cost of exploring the value of each alternative. In the original formulation, it is assumed that accurate distributions are given for the values of all the alternatives, while recent work studies the online variant of Pandora’s Box where the distributions are originally unknown. In this work, we study Pandora’s Box in the online setting, while incorporating context. At each round, we are presented with a number of alternatives each having a context, an exploration cost and an unknown value drawn from an unknown distribution that may change at every round. Our main result is a no-regret algorithm that performs comparably well against the optimal algorithm which knows all prior distributions exactly. Our algorithm works even in the bandit setting where the algorithm never learns the values of the alternatives that were not explored. The key technique that enables our result is a novel modification of the realizability condition in contextual bandits that connects a context to a sufficient statistic of each alternative’s distribution (its reservation value) rather than its mean.

[4] https://ojs.aaai.org/index.php/AAAI/article/view/29022

We study offline reinforcement learning (RL) with heavy-tailed reward distribution and data corruption: (i) Moving beyond subGaussian reward distribution, we allow the rewards to have infinite variances; (ii) We allow corruptions where an attacker can arbitrarily modify a small fraction of the rewards and transitions in the dataset. We first derive a sufficient optimality condition for generalized Pessimistic Value Iteration (PEVI), which allows various estimators with proper confidence bounds and can be applied to multiple learning settings. In order to handle the data corruption and heavy-tailed reward setting, we prove that the trimmed-mean estimation achieves the minimax optimal error rate for robust mean estimation under heavy-tailed distributions. In the PEVI algorithm, we plug in the trimmed mean estimation and the confidence bound to solve the robust offline RL problem. Standard analysis reveals that data corruption induces a bias term in the suboptimality gap, which gives the false impression that any data corruption prevents optimal policy learning. By using the optimality condition for the generalized PEVI, we show that as long as the bias term is less than the “action gap”, the policy returned by PEVI achieves the optimal value given sufficient data.

[5] https://ojs.aaai.org/index.php/AAAI/article/view/29346

To ensure the usefulness of Reinforcement Learning (RL) in real systems, it is crucial to ensure they are robust to noise and adversarial attacks. In adversarial RL, an external attacker has the power to manipulate the victim agent’s interaction with the environment. We study the full class of online manipulation attacks, which include (i) state attacks, (ii) observation attacks (which are a generalization of perceived-state attacks), (iii) action attacks, and (iv) reward attacks. We show the attacker’s problem of designing a stealthy attack that maximizes its own expected reward, which often corresponds to minimizing the victim’s value, is captured by a Markov Decision Process (MDP) that we call a meta-MDP since it is not the true environment but a higher level environment induced by the attacked interaction. We show that the attacker can derive optimal attacks by planning in polynomial time or learning with polynomial sample complexity using standard RL techniques. We argue that the optimal defense policy for the victim can be computed as the solution to a stochastic Stackelberg game, which can be further simplified into a partially-observable turn-based stochastic game (POTBSG). Neither the attacker nor the victim would benefit from deviating from their respective optimal policies, thus such solutions are truly robust. Although the defense problem is NP-hard, we show that optimal Markovian defenses can be computed (learned) in polynomial time (sample complexity) in many scenarios.

[6] https://ojs.aaai.org/index.php/AAAI/article/view/29529

We characterize offline data poisoning attacks on Multi-Agent Reinforcement Learning (MARL), where an attacker may change a data set in an attempt to install a (potentially fictitious) unique Markov-perfect Nash equilibrium for a two-player zero-sum Markov game. We propose the unique Nash set, namely the set of games, specified by their Q functions, with a specific joint policy being the unique Nash equilibrium. The unique Nash set is central to poisoning attacks because the attack is successful if and only if data poisoning pushes all plausible games inside it. The unique Nash set generalizes the reward polytope commonly used in inverse reinforcement learning to MARL. For zero-sum Markov games, both the inverse Nash set and the set of plausible games induced by data are polytopes in the Q function space. We exhibit a linear program to efficiently compute the optimal poisoning attack. Our work sheds light on the structure of data poisoning attacks on offline MARL, a necessary step before one can design more robust MARL algorithms.

March 18, 2024

  • Prof Rahul Chatterjee gave an invited talk at Cylab (CMU) last month entitled “Privacy and Safety Issues of Smart Home Devices”.  [1]
  • Prof Loris D’Antoni gave a talk to a cohort of first-generation College students in the Center for Educational Opportunity (CeO) first-year seminar on how to succeed as a first-generation student. [2]
  • Prof Patrick McDaniel and colleagues published the paper “Host-based Flow Table Size Inference in Multi-hop SDN” (some time back) in GLOBECOM [picked up by DBLP scan].  [3]

Notes:

[1] https://www.cylab.cmu.edu/events/2024/02/19-seminar-chatterjee.html

Rahul Chatterjee

Assistant Professor in Computer Sciences, University of Wisconsin–Madison

Talk Title:

Privacy and Safety Issues of Smart Home Devices

Abstract:

Smart home devices (a.k.a. IoT devices), like Internet-connected thermostats, door locks, and item trackers, are transforming our personal spaces, offering convenience and automation. While these innovations enhance our lives, they also pose potential privacy and safety concerns. In this talk, we’ll dive deeper into these critical issues, exploring the privacy and safety risks posed by smart devices.

In the first part, I will talk about the privacy concerns posed by integrator services, such as Google Home, Apple HomeKit, IFTTT, and Zapier, which enable users to control smart home devices from different manufacturers and set up rules to interconnect them. These integrator services today have access to sensitive private data and the ability to control devices of all of their users. Thus, they are a ripe target for attack and a single point of failure in the IoT ecosystem should they be compromised. I will show how to minimize the privacy exposure to these services using novel data and metadata hiding techniques while maintaining their functionality and being compatible with the current IoT deployment model.

In the second part of the talk, I will focus on the safety concern with smart homes, especially in the intimate partner violence (IPV) contexts. We surveyed online forums, blog posts, and news articles and interviewed survivors of IPV and advocates who work with them to systematize the risk posed by smart home devices and how survivors cope with this new vector of abuse. We found that certain devices like hidden cameras and microphones are designed for surreptitious surveillance and easily available via large U.S. retailers like Amazon and Best Buy. Seemingly innocuous devices like smart thermostats and smart speakers are being used for spying and/or harassing the survivors. I will end with some open challenges for survivors today to combat smart home abuse.

[2] https://ceo.wisc.edu/

The Center for Educational Opportunity (CeO) empowers eligible students to engage and achieve through holistic advising, connecting students with campus resources, and fostering a supportive climate for all students.

CeO helps students progress through their academic careers at UW–Madison with the support of two federally funded TRIO grants and a state-funded program Student Support Services (SSS), Student Support Services-STEM (STEM), and the Academic Success and Achievement Program (ASAP).

Learn more about our mission and TRIO services.

[3] “Host-based Flow Table Size Inference in Multi-hop SDN”

Tian Xie, Sanchal Thakkar, Ting He, Novella Bartolini, Patrick McDaniel

GLOBECOM ‘23

https://globecom2023.ieee-globecom.org/

Paper:

https://bpb-us-e1.wpmucdn.com/sites.psu.edu/dist/a/74125/files/2023/04/Host-based-Flow-Table-Size-Inference-in-Multi-hop-SDN-Report.pdf

As a novel network paradigm, Software Defined Networking (SDN) has greatly simplified network management, but also introduced new vulnerabilities. One vulnerability of particular interest is the flow table, a data structure in every SDN-enabled switch that caches flow rules from the controller to bridge the speed gap between the data plane and the control plane. Prior works have shown that an adversary-controlled host can accurately infer parameters of the flow table at its directly-connected edge switch, which can then be used to launch intelligent attacks. However, those solutions do not work for flow tables at internal switches. In this work, we develop an algorithm that can infer the different flow table sizes at internal switches by measuring the Round Trip Times (RTTs) of a path traversing these switches from one of its endpoints. A major challenge in this problem is the lack of an inferable relationship between the RTTs and the flow table hits/misses at the traversed switches. Our solution addresses this challenge by experimentally identifying the inferable information and designing an inference algorithm that combines carefully designed probing sequences and statistical tools to mitigate measurement noise and interference. The efficacy of our solution is validated through experiments in Mininet.

March 11, 2024

  • Prof Rahul Chatterjee and colleague received an OVC grant for their proposal “Remote Tech Clinic to Combat Tech-Enabled Domestic Abuse in Rural Wisconsin”.  [1]
  • Prof Tom Reps and colleagues had the paper “CFLOBDDs: Context-Free-Language Ordered Binary Decision Diagrams” accepted into TOPLAS. The paper also was accepted as a “journal-first” contribution to PLDI 2024.  [2]
  • Prof Rahul Chatterjee and colleagues had the paper “Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces” accepted into TACO ‘24. [3] 
  • Prof Patrick McDaniel and colleagues had the paper “Characterizing the Modification Space of Signature IDS Rules” published in MILCOM. [4]

Notes:

[1] Kate Walsh (Psychology) and Rahul Chatterjee received an Office for Victims of Crime (OVC) grant from NIJ last year. 

https://ovc.ojp.gov/funding/awards/15povc-23-gk-01414-nonf

The current project proposes to create a remote tech clinic program housed at the University of Wisconsin-Madison to help combat technology-facilitated intimate partner violence (IPV) and provide direly needed support for victims of such abuse. The purpose of this research-practice partnership with victim service providers throughout the state of WI is to increase access to tech clinic support for survivors living in remote parts of WI.

Project activities will include (A1) semi-structured interviews with victim-survivors and advocates in remote and rural areas to understand the epidemiology of technology-facilitated abuse; (A2) use of participatory design techniques to build toolsets and protocols required for carrying out remote tech consultations with advocates and victims; (A3) development of training for tech clinic volunteers and IPV advocates so that they can use those tools effectively to support victims of technology abuse; and (A4) collection of data on the efficacy of the remote tech clinic program.

Expected outcomes of the project include (O1) reports and publications that include data collected from the interviews about technology-facilitated abuse and survivor needs as well as data collected to ascertain the efficacy of the tech clinic program; (O2) a toolkit that can be

implemented in tech clinics across the US; and (O3) training protocols for tech clinic volunteers and IPV advocates that could accompany these toolkits and be tailored for particular clinics. The current proposal seeks to provide services to all IPV victim-survivors and advocates throughout the state of Wisconsin; however, all toolkits and training protocols will be made publicly available so that other tech clinics can replicate the framework and increase access to such services throughout the country. We also will make statistical code publicly available to facilitate research transparency.

[2] “CFLOBDDs: Context-Free-Language Ordered Binary Decision Diagrams” 

Tom Reps, Meghana Sistla, Swarat Chaudhuri
ACM Transactions on Programming Languages and Systems (TOPLAS), 

https://dl.acm.org/doi/10.1145/3651157

It has also been accepted as a “journal-first” contribution to PLDI 2024 (ACM SIGPLAN International Conference on Programming Language Design and Implementation).

As explained in the related-work section, the paper describes something I [Reps] started to work on in 1998, but didn’t find a successful application at that time.  I went back to it from time to time, but overall it was very slowly cooked . . . 26 years in the making.

[3] “Camouflage: Utility-Aware Obfuscation for Accurate Simulation of Sensitive Program Traces”

Asmita Pal, Keerthana Desai, Rahul Chatterjee, and Joshua San Miguel

ACM Transactions on Architecture and Code Optimization (TACO)

https://dl-acm-org.ezproxy.library.wisc.edu/doi/10.1145/3650110

Trace-based simulation is a widely used methodology for system design exploration. It relies on realistic traces that represent a range of behaviors necessary to be evaluated, containing a lot of information about the application, its inputs and the underlying system on which it was generated. Consequently, generating traces from real-world executions risk leakage of sensitive information. To prevent this, traces can be obfuscated before release. However, this can undermine their ideal utility, i.e., how realistically a program behavior was captured. To address this, we propose Camouflage, a novel obfuscation framework, designed with awareness of the necessary architectural properties required to preserve trace utility, while ensuring secrecy of the inputs used to generate the trace. Focusing on memory access traces, our extensive evaluation on various benchmarks shows that camouflaged traces preserve the performance measurements of the original execution, with an average τ correlation of 0.66. We model input secrecy as an input indistinguishability problem and show that the average security loss is 7.8%, which is better than traces generated from the state-of-the-art.

[4] “Characterizing the Modification Space of Signature IDS Rules”

Ryan Guide, Eric Pauley, Yohan Beugin, Ryan Sheatsley, Patrick McDaniel

MILCOM 2023

[found by DBLP Scan]

https://arxiv.org/abs/2402.09644

Signature-based Intrusion Detection Systems (SIDSs) are traditionally used to detect malicious activity in networks. A notable example of such a system is Snort, which compares network traffic against a series of rules that match known exploits. Current SIDS rules are designed to minimize the amount of legitimate traffic flagged incorrectly, reducing the burden on network administrators. However, different use cases than the traditional one–such as researchers studying trends or analyzing modified versions of known exploits–may require SIDSs to be less constrained in their operation. In this paper, we demonstrate that applying modifications to real-world SIDS rules allow for relaxing some constraints and characterizing the performance space of modified rules. We develop an iterative approach for exploring the space of modifications to SIDS rules. By taking the modifications that expand the ROC curve of performance and altering them further, we show how to modify rules in a directed manner. Using traffic collected and identified as benign or malicious from a cloud telescope, we find that the removal of a single component from SIDS rules has the largest impact on the performance space. Effectively modifying SIDS rules to reduce constraints can enable a broader range of detection for various objectives, from increased security to research purposes.

March 4, 2024

  • Prof Rahul Chatterjee was awarded an NSF CAREER award for his proposal entitled “Account Security Against Interpersonal Attacks”.  [1] 
  • Prof Mohit Gupta was awarded an ONR Grant for his proposal entitled “Probing Materials from a Distance”. [2]
  • Prof Bart Miller gave a keynote at NDSS Binary Analysis Research Workshop entitled “Binary Code Patching: An Ancient Art Refined for the 21st Century”.  [3]
  • Prof Miron Livny and the CHTC team helped astronomers “study the Universe 300x faster” as part of their work on the NSF Path and Pelican projects.  [4]
  • Prof Bart Miller and many within CS and CDIS helped run another successful job fair, with more than 600 students attending. [5]

Notes:

[1] https://www.nsf.gov/awardsearch/showAward?AWD_ID=2339679&HistoricalAwards=false

“CAREER: Account Security Against Interpersonal Attacks”

Account security logs are designed by online services to help users detect if there was an unauthorized login to their accounts. However, current account security logs offer unreliable and coarse-grained information for users to differentiate their logins from attackers’ logins. Limited data, such as device model and approximate city or province based on IP addresses, can be easily spoofed by attackers, especially in cases of interpersonal attacks. Interpersonal attackers know the victims personally; they may be an intimate partner, family member, friend, or colleague. An interpersonal attacker may live in the same house or town and possess devices with identical models as the victim, making it challenging for victims to conclusively detect unauthorized logins by their attackers. The key problem here is twofold: (a) there is no unique and non-spoofable device identifier that preserves user privacy, and (b) humans and online services identify physical devices differently. In this project, we aim to tackle these issues by developing a framework of device identifiers that can uniquely identify a device while preserving users’ privacy and users are able to recognize and associate those ids with their respective physical device, bridging the disconnect between the methods of identifying devices by humans and software.

This project is designing, implementing, and evaluating novel ways to improve account security logs to enhance unauthorized login detection. The objectives of the project are to (a) design and implement a protocol for deriving unique yet privacy preserving identifiers of devices; (b) explore methods to familiarize users with such device identifiers without disrupting their user experience, and (c) redesign account security logs to incorporate such identifiers and measure their efficacy in detecting unauthorized logs. The broader impacts of the project include: (1) producing guidelines and engaging with developers of online services to enhance account security mechanisms, (2) deploying unauthorized login detection with Madison Tech Clinic (MTC) and Clinic to End Tech Abuse (CETA) in New York City to support survivors of IA, (3) teaching students about nuanced threat models, like those of IA, and (4) increasing the participation of students from minority backgrounds, such as women and LGBTQ+ students, by providing a space to engage and influence technologies with real-world impact on their lives.

[2] “Probing Materials from a Distance”

PI: Mohit Gupta; Co-PI: Shree Nayar (Columbia University)

Duration: 2024-2027

Award: $900K

Abstract: Today, computer vision technologies are being used in many aspects of our everyday lives. The data used by most computer vision systems are typically images and videos captured by conventional RGB cameras, and more recently, depth (RGBD) cameras. These cameras, along with modern advances in machine learning and deep learning techniques, have enabled vision systems to perform a wide range of tasks robustly, including object detection, segmentation and face identification. We believe an open and important problem in holistic scene understanding that complements these tasks is the recognition of materials. Our goal in this project is to design techniques and systems that are capable of producing RGBDM images (M for material), thereby recovering material properties at an individual pixel level. If we know that a pixel corresponds to wood, metal, concrete, paper, plant or skin, this information would greatly empower virtually all vision algorithms including segmentation, detection and recognition. To this end, we propose using high-speed time-resolved scene measurements, combined with polarization and spectrum cues. We will develop a comprehensive theory, practical computational techniques and hardware prototypes for recovering a wide range of material properties from a distance, using compact material probe devices. Finally, we will create large-scale material property datasets for several real-world materials, and design hybrid physics- and learning-based techniques for recovering material properties ‘in-the-wild’ from material probe measurements.

[3] https://www.ndss-symposium.org/ndss2024/co-located-events/bar/

“Binary Code Patching: An Ancient Art Refined for the 21st Century”

Bart Miller

Keynote, NDSS Binary Analysis Research Workshop

Bart also received a note from the workshop chair, Stefan  Nagy: “Thank you so much for a fantastic keynote! It was really fun  learning about the rich history of binary rewriting as well as the past, current, and future work your group has been doing in this space. I’m sure I can speak for Jun too in saying that your work on binary analysis and fuzzing have been huge inspirations to us and our students.”

[5] Spring Job Fair

Spring is the smaller of the two, but there were more than 600 attendees, predominantly from CS programs.

Thanks to Justin Hines for organizing, and Gigi Mitchell and the other staff and volunteers for running the event so smoothly, and to Bart Miller for starting this amazing series and for his continued involvement over many years.

As a quick recap, we hosted 26 companies, a 13% increase from last Spring and welcomed 619 students a whopping 49% increase from last Spring.  

February 26, 2024

Congratulations to Prof Xiangyao Yu, who received a Sloan Fellowship.  [1]

Prof Josiah Hanna gave a new faculty highlight talk at the AAAI Conference on “Scaling Offline Evaluation of Reinforcement Learning Agents through Abstraction”.  [2]

Prof Fred Sala and collaborators had the paper “Zero-Shot Robustification of Zero-Shot Models” accepted into ICLR ‘24.  [3] This paper is based on the workshop version of the paper which received a best paper (honorable mention) at the NeurIPS workshop on robustness in foundation models. 

Prof Mohit Gupta received his fourth consecutive Sony Faculty Innovation Award for his work on “Photon-Starved Scene Inference Using Single-Photon Cameras”.  [4] 

Notes:

[1] https://sloan.org/fellowships/2024-Fellows

Congratulations to the Sloan Research Fellows of 2024. The following 126 early-career scholars represent the most promising scientific researchers working today. Their achievements and potential place them among the next generation of scientific leaders in the U.S. and Canada. Winners receive $75,000, which may be spent over a two-year term on any expense supportive of their research.

Local news:

https://news.wisc.edu/two-uw-madison-researchers-receive-prestigious-sloan-fellowships/

[2] https://aaai.org/aaai-conference/nfh-24-program/

New faculty highlight talk at the AAAI Conference 

“Scaling Offline Evaluation of Reinforcement Learning Agents through Abstraction” 

The talk highlighted a line of work that my PhD student Brahma Pavse has been leading toward enabling confidence in learned decision-making.

More details:

A critical challenge for the widescale adoption of reinforcement learning (RL) is the need to give domain experts assurance that learned policies will improve decision-making — and not lead to unacceptable behavior. To meet this challenge, my work aims to develop new methods for offline policy evaluation in real world RL domains. There has been much recent interest in offline evaluation and many advances. However, recent benchmarking efforts have also shown that there remains a substantial gap between current state-of-the-art methods and real world domains such as robotics. Towards scalable offline evaluation, my group is investigating the use of methods for abstraction and representation learning. In this new faculty highlight, I will present our recent results that show the promise of this direction for scaling offline evaluation in RL domains. I will then describe future directions in this line of that work which will further realize the promise of offline policy evaluation for increasing confidence in deployed RL.

[3] Zero-Shot Robustification of Zero-Shot Models

https://arxiv.org/abs/2309.04344

ICLR 2024 (https://iclr.cc/)

Dyah Adila, Changho Shin, Linrong Cai, Frederic Sala

Abstract: Zero-shot inference is a powerful paradigm that enables the use of large pretrained models for downstream classification tasks without further training. However, these models are vulnerable to inherited biases that can impact their performance. The traditional solution is fine-tuning, but this undermines the key advantage of pretrained models, which is their ability to be used out-of-the-box. We propose a method that improves the robustness of pretrained model embeddings in a fully zero-shot fashion. First, we use language models (LMs) to obtain useful insights from task descriptions. These insights are embedded and used to remove harmful and boost useful components in embeddings—without any supervision. Theoretically, we provide a simple and tractable model for biases in zero-shot embeddings and give a result characterizing under what conditions our approach can boost performance. Empirically, we evaluate our technique on nine image and NLP classification tasks and show an average improvement of 15.98% on worst group accuracy, with trivial decrease in overall accuracy over several zero-shot baselines. Additionally, we demonstrate that our approach is compatible with a variety of pretrained and language models and propose a way to further boost performance with a zero-shot adaptation variant.

Workshop version: “Foundation Models Can Robustify Themselves, For Free”

Workshop on Robustness of Zero/Few-shot Learning in Foundation Models (R0-FoMo) at NeurIPS 2023 

Dyah Adila, Changho Shin, Linrong Cai, Frederic Sala

(Same abstract)

[4] Sony Faculty Innovation Award (1 year)

Title: Photon-Starved Scene Inference Using Single-Photon Cameras

PI: Mohit Gupta

Abstract: The goal of this proposal is to develop vision systems that can recover scene information under practical constraints such as limited power, latency and bandwidth. Imagine a robot exploring an underground cave, a robot arm performing inspection of dark machine parts, or a telescope imaging distant galaxies. In such extreme conditions, images captured by conventional cameras get overwhelmed by noise, causing the signal-to-noise ratio (SNR) to dip below the threshold required for downstream inference algorithms to extract meaningful scene information. To achieve these goals, we propose developing novel signal-processing, computer vision and machine learning algorithms for raw data captured by single-photon cameras. Broadly, the proposed work will consist of (a) designing algorithms for recovering low-level image features such as edges and corners directly from photon streams, (b) designing photon-processing and filtering algorithms for active single-photon imaging (e.g., to support long-range LiDAR), and (c) develop novel photon-stream representations for power- and bandwidth-efficient single-photon imaging (both passive and active scenarios).

February 12 and 19, 2024

  • Prof Steve Wright was elected into the National Academy of Engineering.  [1]
  • Prof Efty Sifakis and Prof Loris D’Antoni were named Vilas Associates.  [2] 
  • Prof Miron Livny and his work were mentioned prominently in the Chancellor’s address to the Board of Regents. [3] 
  • Prof Michael Ferris and colleagues published the paper “Foundry-ML – Software and Services to Simplify Access to Machine Learning Datasets in Materials Science” in the Journal of Open Source Software.  [4]

Prof Sharon Li had the paper “How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?” appear in IJCV.  [5]

In addition, a great number of papers by our faculty have been made available on ArXiV these past few weeks (thanks to DBLP Scan[™] for this information). This includes:

  • Prof Ilias Diakonikolas and Prof Sharon Li’s paper on “How Does Unlabeled Data Provably Help Out-of-Distribution Detection?” [6]
  • Prof Josiah Hanna’s paper on “Future Prediction Can be a Strong Evidence of Good History Representation in Partially Observable Environments” [7] 
  • Prof Somesh Jha’s paper on “Do Large Code Models Understand Programming Concepts? A Black-box Approach” [8]
  • Prof Sharon Li’s paper on “HYPO: Hyperspherical Out-of-Distribution Generalization” (ICLR ‘24) [9]
  • Prof Ethan Cecchetti’s paper on “Computationally Bounded Robust Compilation and Universally Composable Security” (Computer Security Foundations Symposium) [10]
  • Prof Loris D’Antoni and Prof Tom Reps’s paper on “Automating Unrealizability Logic: Hoare-style Proof Synthesis for Infinite Sets of Programs” [11]
  • Prof Michael Gleicher and Prof Bilge Mutlu’s paper on “A System for Human-Robot Teaming through End-User Programming and Shared Autonomy” (HRI ‘24) [12]
  • Prof Yea-Seul Kim’s paper on “Erie: A Declarative Grammar for Data Sonification” (CHI ‘24) [13]
  • Prof Bilge Mutlu’s paper on “Toward Family-Robot Interactions: A Family-Centered Framework in HRI” [14]
  • Prof Fred Sala’s papers on “Multimodal Data Curation via Object Detection and Filter Ensembles” (TNGCV ‘23) and “Product Manifold Representations for Learning on Biological Pathways” [15]
  • Prof Efty Sifakis’s paper on “An Implicit Physical Face Model Driven by Expression and Style” (SIGGRAPH ASIA ‘23) [16]
  • Prof Shiv Venkataraman’s paper on “Decoding Speculative Decoding” [17]

Notes:

[1] https://www.nae.edu/312025/NAENewClass2024

Election to the National Academy of Engineering is among the highest professional distinctions accorded to an engineer. Academy membership honors those who have made outstanding contributions to “engineering research, practice, or education, including, where appropriate, significant contributions to the engineering literature” and to “the pioneering of new and developing fields of technology, making major advancements in traditional fields of engineering, or developing/implementing innovative approaches to engineering education.” Election of new NAE members is the culmination of a yearlong process. The ballot is set in December and the final vote for membership occurs during January.

[2] https://research.wisc.edu/professorships-and-faculty-fellowships/vilas-associates/

ELIGIBILITY: All tenure-track assistant professors and tenured faculty within 20 years of their tenure date are eligible to apply for this award.

ABOUT THE AWARD

The Vilas Associates Competition recognizes new and ongoing research of the highest quality and significance. Recipients are chosen competitively by the divisional Research Committees on the basis of a detailed proposal. Winners receive up to two-ninths of research salary support (including the associated fringe costs) for both summers 2024 and 2025, as well as a $12,500 flexible research fund in each of the two fiscal years. Faculty paid on an annual basis are not eligible for the summer salary support but are eligible for the flexible fund portion of this award. This award cannot be awarded more than one time to any faculty member.

Departments or programs with fewer than 15 faculty members, or a formally constituted interdepartmental group of six or more faculty members may nominate one individual. Departments, programs, or formally constituted interdepartmental groups with 15 or more faculty may nominate up to two individuals.

[3] https://chancellor.wisc.edu/remarks-mnookin/chancellor-mnookin-unveils-bold-new-initiatives-to-innovate-for-the-public-good-address-global-challenges/

“For example, Professor Miron Livny, whom you all know, pioneered high-throughput computing techniques that are now powering some of the world’s largest scientific experiments, including the search for cosmic neutrinos and black holes. Today, that work is part of what positions us to move forward in some exciting new directions.”

[4] https://joss.theoj.org/papers/10.21105/joss.05467

“Foundry-ML – Software and Services to Simplify Access to Machine Learning Datasets in Materials Science”

KJ Schmidt. Aristana Scourtas, Logan Ward, Steve Wangen, Marcus Schwarting, Isaac Darling,  Ethan Truelove, Aadit Ambadkar, Ribhav Bose, Zoa Katok, Jingrui Wei, Xiangguo Li, Ryan Jacobs, Lane Schultz, Doyeon Kim, Michael Ferris, Paul M. Voyles, Dane Morgan, Ian Foster, Ben Blaiszik

The application of open science and machine learning to scientific, engineering, and industry relevant problems is a critical component of the cross-department U.S. Artificial Intelligence (AI) strategy highlighted e.g., by the AI Initiative, the recent National AI Strategy report (“Strengthening and Democratizing the u.s. Artificial Intelligence Innovation Ecosystem – an Implementation Plan for a National Artificial Intelligence Research Resource,” 2023), the Year of Open Data, Materials Genome Initiative (Pablo et al., 2019; Ward & Warren, 2015), andmore. A key aspect of these strategies is to ensure that infrastructure exists to make datasets easily accessible for training, retraining, reproducing, and verifying model performance on chosen tasks. However, the discovery of high-quality, curated datasets adhering to the FAIR principles (findable, accessible, interoperable and reusable) remains a challenge.

To overcome these dataset access challenges, we introduce Foundry-ML, software that combines several services to provide researchers capabilities to publish and discover structured datasets for ML in science, specifically in materials science and chemistry. Foundry-ML consists of a Python client, a web app, and standardized metadata and file structures built using services including the Materials Data Facility(Blaiszik et al., 2016, 2019) and Globus (Ananthakrishnan et al., 2018; Chard et al., 2015). Together, these services work in conjunction with Python software tooling to dramatically simplify data access patterns, as we show below.

[5] How Does Fine-Tuning Impact Out-of-Distribution Detection for Vision-Language Models?

Yifei Ming, Yixuan Li

International Journal of Computer Vision

https://arxiv.org/abs/2306.06048

Recent large vision-language models such as CLIP have shown remarkable out-of-distribution (OOD) detection and generalization performance. However, their zero-shot in-distribution (ID) accuracy is often limited for downstream datasets. Recent CLIP-based fine-tuning methods such as prompt learning have demonstrated significant improvements in ID classification and OOD generalization where OOD labels are available. Nonetheless, it remains unclear whether the model is reliable to semantic shifts without OOD labels. In this paper, we aim to bridge the gap and present a comprehensive study to understand how fine-tuning impact OOD detection for few-shot downstream tasks. By framing OOD detection as multi-modal concept matching, we establish a connection between fine-tuning methods and various OOD scores. Our results suggest that a proper choice of OOD scores is essential for CLIP-based fine-tuning. In particular, the maximum concept matching (MCM) score provides a promising solution consistently. We also show that prompt learning demonstrates the state-of-the-art OOD detection performance over the zero-shot counterpart.

[6] https://arxiv.org/abs/2402.03502

How Does Unlabeled Data Provably Help Out-of-Distribution Detection?

Xuefeng Du, Zhen Fang, Ilias Diakonikolas, Yixuan Li

Using unlabeled data to regularize the machine learning models has demonstrated promise for improving safety and reliability in detecting out-of-distribution (OOD) data. Harnessing the power of unlabeled in-the-wild data is non-trivial due to the heterogeneity of both in-distribution (ID) and OOD data. This lack of a clean set of OOD samples poses significant challenges in learning an optimal OOD classifier. Currently, there is a lack of research on formally understanding how unlabeled data helps OOD detection. This paper bridges the gap by introducing a new learning framework SAL (Separate And Learn) that offers both strong theoretical guarantees and empirical effectiveness. The framework separates candidate outliers from the unlabeled data and then trains an OOD classifier using the candidate outliers and the labeled ID data. Theoretically, we provide rigorous error bounds from the lens of separability and learnability, formally justifying the two components in our algorithm. Our theory shows that SAL can separate the candidate outliers with small error rates, which leads to a generalization guarantee for the learned OOD classifier. Empirically, SAL achieves state-of-the-art performance on common benchmarks, reinforcing our theoretical insights. Code is publicly available at this https URL.

[7] https://www.arxiv.org/abs/2402.07102

Future Prediction Can be a Strong Evidence of Good History Representation in Partially Observable Environments

Jeongyeol Kwon, Liu Yang, Robert Nowak, Josiah Hanna

Learning a good history representation is one of the core challenges of reinforcement learning (RL) in partially observable environments. Recent works have shown the advantages of various auxiliary tasks for facilitating representation learning. However, the effectiveness of such auxiliary tasks has not been fully convincing, especially in partially observable environments that require long-term memorization and inference. In this empirical study, we investigate the effectiveness of future prediction for learning the representations of histories, possibly of extensive length, in partially observable environments. We first introduce an approach that decouples the task of learning history representations from policy optimization via future prediction. Then, our main contributions are two-fold: (a) we demonstrate that the performance of reinforcement learning is strongly correlated with the prediction accuracy of future observations in partially observable environments, and (b) our approach can significantly improve the overall end-to-end approach by preventing high-variance noisy signals from reinforcement learning objectives to influence the representation learning. We illustrate our claims on three types of benchmarks that necessitate the ability to process long histories for high returns.

[8] https://arxiv.org/abs/2402.05980

Do Large Code Models Understand Programming Concepts? A Black-box Approach

Ashish Hooda, Mihai Christodorescu, Miltos Allamanis, Aaron Wilson, Kassem Fawaz, Somesh Jha

Large Language Models’ success on text generation has also made them better at code generation and coding tasks. While a lot of work has demonstrated their remarkable performance on tasks such as code completion and editing, it is still unclear as to why. We help bridge this gap by exploring to what degree auto-regressive models understand the logical constructs of the underlying programs. We propose Counterfactual Analysis for Programming Concept Predicates (CACP) as a counterfactual testing framework to evaluate whether Large Code Models understand programming concepts. With only black-box access to the model, we use CACP to evaluate ten popular Large Code Models for four different programming concepts. Our findings suggest that current models lack understanding of concepts such as data flow and control flow.

[9] https://arxiv.org/abs/2402.07785

HYPO: Hyperspherical Out-of-Distribution Generalization

Haoyue Bai, Yifei Ming, Julian Katz-Samuels, Yixuan Li

Out-of-distribution (OOD) generalization is critical for machine learning models deployed in the real world. However, achieving this can be fundamentally challenging, as it requires the ability to learn invariant features across different domains or environments. In this paper, we propose a novel framework HYPO (HYPerspherical OOD generalization) that provably learns domain-invariant representations in a hyperspherical space. In particular, our hyperspherical learning algorithm is guided by intra-class variation and inter-class separation principles — ensuring that features from the same class (across different training domains) are closely aligned with their class prototypes, while different class prototypes are maximally separated. We further provide theoretical justifications on how our prototypical learning objective improves the OOD generalization bound. Through extensive experiments on challenging OOD benchmarks, we demonstrate that our approach outperforms competitive baselines and achieves superior performance. Code is available at this https URL.

[10] https://arxiv.org/pdf/2401.15041.pdf

Computationally Bounded Robust Compilation and Universally Composable Security

Robert Kunnemann, Marco Patrignani, Ethan Cecchetti

Computer Security Foundations Symposium, 2024

Abstract—Universal Composability (UC) is the gold standard for cryptographic security, but mechanizing proofs of UC is notoriously difficult. A recently-discovered connection between UC and Robust Compilation (RC )—a novel theory of secure compilation—provides a means to verify UC proofs using tools that mechanize equality results. Unfortunately, the existing methods apply only to perfect UC security, and real-world protocols relying on cryptography are only computationally secure. This paper addresses this gap by lifting the connection between UC and RC to the computational setting, extending techniques from the RC setting to apply to computational UC security. Moreover, it further generalizes the UC–RC connection beyond computational security to arbitrary equalities, providing a framework to subsume the existing perfect case, and to instantiate future theories with more complex notions of security. This connection allows the use of tools for proofs of computational indistinguishability to properly mechanize proofs of computational UC security. We demonstrate this power by using CRYPTOVERIF to mechanize a proof that parts of the Wireguard protocol are computationally UC secure. Finally, all proofs of the framework itself are verified in Isabelle/HOL.

[11] https://arxiv.org/abs/2401.13244

​​Automating Unrealizability Logic: Hoare-style Proof Synthesis for Infinite Sets of Programs

Shaan Nagy, Jinwoo Kim, Loris D’Antoni, Thomas Reps

Unrealizability logic (UL) was proposed by Kim et al. as the first Hoare-style proof system for proving properties that hold for an infinite set of programs (defined by a regular tree grammar). The goal of our work is to automate reasoning and proof generation for UL. A key ingredient in UL is the notion of nonterminal summaries-inductive facts that characterize recursive nonterminals in the grammar that defines the set of programs. They are analogous to procedure summaries in Hoare logic. The goal of automating UL led us to reformulate the inference rules-in particular, introducing a unified rule for nonterminal summaries, called the rule of adaptation, which draws inspiration from how procedure summaries are handled in Hoare logic. In the same way that verification conditions can be used to synthesize loop invariants for Hoare logic proofs, our reformulation of UL reduces the problem of synthesizing a nonterminal summary to a Syntax-Guided Synthesis problem. We implement Wuldo, the first checker and synthesizer for UL. Wuldo can express proofs beyond the reach of existing tools, including proofs that establish how infinitely many programs behave on infinitely many inputs, and in some cases Wuldo can even synthesize the needed nonterminal summaries.

[12] https://arxiv.org/abs/2401.12380

A System for Human-Robot Teaming through End-User Programming and Shared Autonomy

Michael Hagenow, Emmanuel Senft, Robert Radwin, Michael Gleicher, Michael Zinn, Bilge Mutlu

HRI ’24

Many industrial tasks-such as sanding, installing fasteners, and wire harnessing-are difficult to automate due to task complexity and variability. We instead investigate deploying robots in an assistive role for these tasks, where the robot assumes the physical task burden and the skilled worker provides both the high-level task planning and low-level feedback necessary to effectively complete the task. In this article, we describe the development of a system for flexible human-robot teaming that combines state-of-the-art methods in end-user programming and shared autonomy and its implementation in sanding applications. We demonstrate the use of the system in two types of sanding tasks, situated in aircraft manufacturing, that highlight two potential workflows within the human-robot teaming setup. We conclude by discussing challenges and opportunities in human-robot teaming identified during the development, application, and demonstration of our system.

[13] https://arxiv.org/abs/2402.00156

Erie: A Declarative Grammar for Data Sonification

Hyeok Kim, Yea-Seul Kim, Jessica Hullman

CHI ‘24

Data sonification-mapping data variables to auditory variables, such as pitch or volume-is used for data accessibility, scientific exploration, and data-driven art (e.g., museum exhibitions) among others. While a substantial amount of research has been made on effective and intuitive sonification design, software support is not commensurate, limiting researchers from fully exploring its capabilities. We contribute Erie, a declarative grammar for data sonification, that enables abstractly expressing auditory mappings. Erie supports specifying extensible tone designs (e.g., periodic wave, sampling, frequency/amplitude modulation synthesizers), various encoding channels, auditory legends, and composition options like sequencing and overlaying. Using standard Web Audio and Web Speech APIs, we provide an Erie compiler for web environments. We demonstrate the expressiveness and feasibility of Erie by replicating research prototypes presented by prior work and provide a sonification design gallery. We discuss future steps to extend Erie toward other audio computing environments and support interactive data sonification.

[14] https://arxiv.org/abs/2401.14478

Toward Family-Robot Interactions: A Family-Centered Framework in HRI

Bengisu Cagiltay, Bilge Mutlu

As robotic products become more integrated into daily life, there is a greater need to understand authentic and real-world human-robot interactions to inform product design. Across many domestic, educational, and public settings, robots interact with not only individuals and groups of users, but also families, including children, parents, relatives, and even pets. However, products developed to date and research in human-robot and child-robot interactions have focused on the interaction with their primary users, neglecting the complex and multifaceted interactions between family members and with the robot. There is a significant gap in knowledge, methods, and theories for how to design robots to support these interactions. To inform the design of robots that can support and enhance family life, this paper provides (1) a narrative review exemplifying the research gap and opportunities for family-robot interactions and (2) an actionable family-centered framework for research and practices in human-robot and child-robot interaction.

[15] https://arxiv.org/abs/2401.12225

Multimodal Data Curation via Object Detection and Filter Ensembles

Tzu-Heng Huang, Changho Shin, Sui Jiet Tay, Dyah Adila, Frederic Sala

Workshop of Towards the Next Generation of Computer Vision Datasets (TNGCV) 

We propose an approach for curating multimodal data that we used for our entry in the 2023 DataComp competition filtering track. Our technique combines object detection and weak supervision-based ensembling. In the first of two steps in our approach, we employ an out-of-the-box zero-shot object detection model to extract granular information and produce a variety of filter designs. In the second step, we employ weak supervision to ensemble filtering rules. This approach results in a 4% performance improvement when compared to the best-performing baseline, producing the top-ranking position in the small scale track at the time of writing. Furthermore, in the medium scale track, we achieve a noteworthy 4.2% improvement over the baseline by simply ensembling existing baselines with weak supervision.

https://arxiv.org/abs/2401.15478

Product Manifold Representations for Learning on Biological Pathways

Daniel McNeela, Frederic Sala, Anthony Gitter

Machine learning models that embed graphs in non-Euclidean spaces have shown substantial benefits in a variety of contexts, but their application has not been studied extensively in the biological domain, particularly with respect to biological pathway graphs. Such graphs exhibit a variety of complex network structures, presenting challenges to existing embedding approaches. Learning high-quality embeddings for biological pathway graphs is important for researchers looking to understand the underpinnings of disease and train high-quality predictive models on these networks. In this work, we investigate the effects of embedding pathway graphs in non-Euclidean mixed-curvature spaces and compare against traditional Euclidean graph representation learning models. We then train a supervised model using the learned node embeddings to predict missing protein-protein interactions in pathway graphs. We find large reductions in distortion and boosts on in-distribution edge prediction performance as a result of using mixed-curvature embeddings and their corresponding graph neural network models. However, we find that mixed-curvature representations underperform existing baselines on out-of-distribution edge prediction performance suggesting that these representations may overfit to the training graph topology. We provide our mixed-curvature product GCN code at this https URL and our pathway analysis code at this https URL.

[16] https://arxiv.org/abs/2401.15414

An Implicit Physical Face Model Driven by Expression and Style

Lingchen Yang, Gaspard Zoss, Prashanth Chandran, Paulo Gotardo, Markus Gross, Barbara Solenthaler, Eftychios Sifakis, Derek Bradley

SIGGRAPH ASIA ‘23

3D facial animation is often produced by manipulating facial deformation models (or rigs), that are traditionally parameterized by expression controls. A key component that is usually overlooked is expression ‘style’, as in, how a particular expression is performed. Although it is common to define a semantic basis of expressions that characters can perform, most characters perform each expression in their own style. To date, style is usually entangled with the expression, and it is not possible to transfer the style of one character to another when considering facial animation. We present a new face model, based on a data-driven implicit neural physics model, that can be driven by both expression and style separately. At the core, we present a framework for learning implicit physics-based actuations for multiple subjects simultaneously, trained on a few arbitrary performance capture sequences from a small set of identities. Once trained, our method allows generalized physics-based facial animation for any of the trained identities, extending to unseen performances. Furthermore, it grants control over the animation style, enabling style transfer from one character to another or blending styles of different characters. Lastly, as a physics-based model, it is capable of synthesizing physical effects, such as collision handling, setting our method apart from conventional approaches.

[17] https://arxiv.org/abs/2402.01528

Decoding Speculative Decoding

Minghao Yan, Saurabh Agarwal, Shivaram Venkataraman

Speculative Decoding is a widely used technique to speed up inference for Large Language Models (LLMs) without modifying its outcome. When performing inference on an LLM, speculative decoding uses a smaller draft model which generates speculative tokens and then uses the target LLM to verify those draft tokens. The speedup provided by speculative decoding heavily depends on the choice of the draft model. It has been widely suggested to select a draft model that provides a high probability of the generated token being accepted by the LLM to achieve the highest throughput. However, our experiments indicate the contrary with throughput diminishing as the probability of generated tokens to be accepted by the target model increases. To understand this phenomenon, we perform extensive experiments to characterize the different factors that affect speculative decoding and how those factors interact and affect the speedups. Based on our experiments we describe an analytical model which can be used to decide the right draft model for a given workload. Further, using our insights we design a new draft model for LLaMA-65B which can provide 30% higher throughput than existing draft models.

February 5, 2024

  • Prof Mike Swift was named a Vilas Distinguished Achievement Professor. [1]
  • Prof Swamit Tannu was awarded an NSF CAREER award for his work on “Enabling Scalable and Resilient Quantum Computer Architectures through Synergistic Hardware-Software Co-Design”. [2]
  • Prof Matt Sinclair is giving a talk entitled “Leveraging open source simulators to enable HW/SW co-design of next-generation HPC systems” at a DOE PI meeting. [3]
  • Prof Mohit Gupta has been awarded a patent for his work on “Systems, Methods, and Media for improving signal-to-noise ratio in single-photon data”.  [4]
  • Prof Yong Jae Lee has published the paper “Edit One for All: Interactive Batch Image Editing” [5]. 
  • Prof Mike Swift has published the paper “Cost-effective and performant virtual WANs with CORNIFER” [6]. 

Notes:

[1] https://provost.wisc.edu/vilas-distinguished-achievement-professorship/

Vilas Distinguished Achievement Professorships (VDAP) recognize UW–Madison faculty members whose distinguished scholarship has advanced the confines of knowledge, and whose excellence also includes teaching or service. Faculty members receiving this award keep the Vilas Distinguished Achievement Professorship title for the duration of their careers.

[2] NSF CAREER Awards

https://new.nsf.gov/funding/opportunities/faculty-early-career-development-program-career

CAREER: The Faculty Early Career Development (CAREER) Program is a Foundation-wide activity that offers the National Science Foundation’s most prestigious awards in support of early-career faculty who have the potential to serve as academic role models in research and education and to lead advances in the mission of their department or organization. Activities pursued by early-career faculty should build a firm foundation for a lifetime of leadership in integrating education and research. NSF encourages submission of CAREER proposals from early-career faculty at all CAREER-eligible organizations and especially encourages women, members of underrepresented minority groups, and persons with disabilities to apply.

Specifics:

“Enabling Scalable and Resilient Quantum Computer Architectures through Synergistic Hardware-Software Co-Design”

Swamit Tannu

NSF CAREER

Abstract – Quantum computers can help solve some of the most complex problems in physics, chemistry, material design, optimization, and machine learning. A worldwide effort is underway to create larger, more reliable quantum computers, enhancing their ability to execute complex quantum algorithms. Despite this progress, practical quantum computing faces significant challenges, including the susceptibility of quantum bit (qubit) devices to environmental noise, producing inaccurate results. Furthermore, the limited availability of algorithms that can leverage quantum computers in the near term is a significant hurdle, as existing algorithms require millions of noise-free operations beyond the current quantum hardware capabilities. The gap between the quantum hardware necessary for solving real-world problems and today’s quantum computing technology is significant. This award seeks to bridge this gap by focusing on scalable and resilient quantum computer architectures through an integrated approach to hardware and software design and developing software tools to help users leverage existing and future quantum hardware. Moreover, this award will create a broadly accessible quantum computing curriculum and support outreach activities to engage and educate undergraduates in quantum information sciences.

This project focuses on two research thrusts. The first thrust concentrates on co-designing Instruction Set Architecture (ISA) and Runtime to enable efficient and resilient architectures and introduce new hardware-software primitives to help scale distributed qubit control. On the ISA front, the project will investigate how flexible instruction sets offered by most quantum hardware platforms, which allow new gates by calibrating pulses, can be leveraged without overly increasing the calibration complexity. For enabling a large number of operations on a Fault-Tolerant Quantum Computer (FTQC), required by most real-world applications, this project will develop a runtime capable of detecting noise amplification events due to unstable qubit devices and mitigating them by moving data and learning optimal noise mitigation policies. The second thrust of this project will focus on developing software tools to facilitate efficient empirical design and evaluation of quantum algorithms. The project will develop tools to design, tune, and debug hybrid classical-quantum workflows. Moreover, the project will study the efficacy of partially fault-tolerant quantum computer architectures, where only a subset of operations are protected against errors, and investigate if co-designing ISA, runtime, error correction, and applications can push partial FTQC toward practical utility.

https://www.nsf.gov/awardsearch/showAward?AWD_ID=2340267&HistoricalAwards=false

[3] “Leveraging open source simulators to enable HW/SW co-design of next-generation HPC systems”

Talk by Matt Sinclair @ DOE ASCR CS PI Meeting

Abstract: We aim to establish a multi-fidelity simulation framework that incorporates best-in-class CMOS and post-CMOS accelerator and power models for both time and space, which improves simulation scalability and adjusts the level of detail required in the models for different components driven by key regions of interest in applications.

[4] “Systems, Methods, and Media for improving signal-to-noise ratio in single-photon data”

Authors: Jongho Lee and Mohit Gupta

(link to corresponding publication page). 

https://wisionlab.com/project/caspi/

[5] “Edit One for All: Interactive Batch Image Editing”

Thao Nguyen, Utkarsh Ojha, Yuheng Li, Haotian Liu, Yong Jae Lee

https://arxiv.org/abs/2401.10219

In recent years, image editing has advanced remarkably. With increased human control, it is now possible to edit an image in a plethora of ways; from specifying in text what we want to change, to straight up dragging the contents of the image in an interactive point-based manner. However, most of the focus has remained on editing single images at a time. Whether and how we can simultaneously edit large batches of images has remained understudied. With the goal of minimizing human supervision in the editing process, this paper presents a novel method for interactive batch image editing using StyleGAN as the medium. Given an edit specified by users in an example image (e.g., make the face frontal), our method can automatically transfer that edit to other test images, so that regardless of their initial state (pose), they all arrive at the same final state (e.g., all facing front). Extensive experiments demonstrate that edits performed using our method have similar visual quality to existing single-image-editing methods, while having more visual consistency and saving significant time and human effort.

[6] “Cost-effective and performant virtual WANs with CORNIFER”

Anjali, Rachee Singh, Michael M. Swift

https://arxiv.org/abs/2401.09620

Virtual wide-area networks (WANs) are WAN-as-a-service cloud offerings that aim to bring the performance benefits of dedicated wide-area interconnects to enterprise customers. In this work, we show that the topology of a virtual WAN can render it both performance and cost inefficient. We develop Cornifer, a tool that designs virtual WAN topologies by deciding the number of virtual WAN nodes and their location in the cloud to minimize connection latency at low cost to enterprises. By leveraging millions of latency measurements from vantage points across the world to cloud points of presence, Cornifer designs virtual WAN topologies that improve weighted client latency by 26% and lower cost by 28% compared to the state-of-the-art. Cornifer identifies virtual WAN topologies at the Pareto frontier of the deployment cost vs. connection latency trade-off and proposes a heuristic for automatic selection of Pareto-optimal virtual WAN topologies for enterprises.

January 29, 2024

[1] When Two Cameras Are a Crowd

By Jongho Lee, Mohit Gupta, Bhuvana Krishnaswamy, Suman Banerjee

Communications of the ACM, December 2023, Vol. 66 No. 12, Pages 72-82

https://cacm.acm.org/magazines/2023/12/278155-when-two-cameras-are-a-crowd/fulltext

Vision and robotics systems enabled by cameras that recover 3D scene geometry are revolutionizing several aspects of our lives via technologies such as autonomous transportation, robotic surgery, and ‘hands-free’ user interfaces. Modern 3D cameras are active devices, where a programmable light source emits coded illumination. The emitted light gets reflected from the scene and is received by a sensor to infer the 3D structure of the surroundings. In a multi-camera environment, such active 3D cameras may receive light from the sources of other cameras, resulting in large depth errors. This problem is becoming increasingly important due to the emergence of low-cost and compact active 3D cameras, which are becoming ubiquitous across a wide range of applications, from consumer devices to vehicular vision systems.

We observe that the multi-camera interference (MCI) problem shares several similarities and dissimilarities with common interference problems in the RF domain. Based on this observation, this article describes new and emerging challenges when multiple active 3D cameras operate in the same spatio-temporal region. The article also outlines some solutions, and more importantly, highlights the next steps.

[2] A hop away from everywhere: A view of the intercontinental long-haul infrastructure

Esteban Carisimo, Mia Weaver, Paul Barford, Fabián E. Bustamante

Proc. ACM Meas. Anal. Comput. Syst.

https://arxiv.org/abs/2303.02514

Over the past two decades, a desire to reduce transit cost, improve control over routing and performance, and enhance the quality of experience for users, has yielded a more densely connected, flat network with fewer hops between sources and destinations. The shortening of paths in terms of the number of hops or links has also meant, for what is at the end an infrastructure-bound network, the lengthening of many of these links. In this paper, we focus on an important aspect of the evolving logical connectivity of the Internet that has received little attention to date: intercontinental long-haul links. We develop a methodology and associated processing system for identifying long haul links in traceroute measurements. We apply this system to a large corpus of traceroute data and report on multiple aspects of long haul connectivity including country-level prevalence, routers as international gateways, preferred long-haul destinations, and the evolution of these characteristics over a 7 year period. We identify over 9K layer 3 links that satisfy our definition for intercontinental long haul with many of them terminating in a relatively small number of nodes. An analysis of connected components shows a clearly dominant one with a relative size that remains stable despite a significant growth of the long-haul infrastructure.

[3] Restricted Holant Dichotomy on Domains 3 and 4

Yin Liu, Austen Z. Fan, Jin-Yi Cai

COCOA 2023: Combinatorial Optimization and Applications

https://arxiv.org/abs/2307.16078

Holant∗(f) denotes a class of counting problems specified by a constraint function f. We prove complexity dichotomy theorems for Holant∗(f) in two settings: (1) f is any arity-3 real-valued function on input of domain size 3. (2) f is any arity-3 {0,1}-valued function on input of domain size 4.

[4] Asymmetric Transfer Hashing with Adaptive Bipartite Graph Learning

Jianglin Lu, Jie Zhou, Yudong Chen, Witold Pedrycz, Kwok-Wai Hung

IEEE Transactions on Cybernetics

https://arxiv.org/abs/2206.12592

Thanks to the efficient retrieval speed and low storage consumption, learning to hash has been widely used in visual retrieval tasks. However, existing hashing methods assume that the query and retrieval samples lie in homogeneous feature space within the same domain. As a result, they cannot be directly applied to heterogeneous cross-domain retrieval. In this paper, we propose a Generalized Image Transfer Retrieval (GITR) problem, which encounters two crucial bottlenecks: 1) the query and retrieval samples may come from different domains, leading to an inevitable {domain distribution gap}; 2) the features of the two domains may be heterogeneous or misaligned, bringing up an additional {feature gap}. To address the GITR problem, we propose an Asymmetric Transfer Hashing (ATH) framework with its unsupervised/semi-supervised/supervised realizations. Specifically, ATH characterizes the domain distribution gap by the discrepancy between two asymmetric hash functions, and minimizes the feature gap with the help of a novel adaptive bipartite graph constructed on cross-domain data. By jointly optimizing asymmetric hash functions and the bipartite graph, not only can knowledge transfer be achieved but information loss caused by feature alignment can also be avoided. Meanwhile, to alleviate negative transfer, the intrinsic geometrical structure of single-domain data is preserved by involving a domain affinity graph. Extensive experiments on both single-domain and cross-domain benchmarks under different GITR subtasks indicate the superiority of our ATH method in comparison with the state-of-the-art hashing methods.

[5] Clustering Mixtures of Bounded Covariance Distributions Under Optimal Separation

Ilias Diakonikolas, Daniel M. Kane, Jasper C. H. Lee, Thanasis Pittas

https://arxiv.org/abs/2312.11769

We study the clustering problem for mixtures of bounded covariance distributions, under a fine-grained separation assumption. Specifically, given samples from a k-component mixture distribution D=∑ki=1wiPi, where each wi≥α for some known parameter α, and each Pi has unknown covariance Σi⪯σ2i⋅Id for some unknown σi, the goal is to cluster the samples assuming a pairwise mean separation in the order of (σi+σj)/α‾‾√ between every pair of components Pi and Pj. Our contributions are as follows:

For the special case of nearly uniform mixtures, we give the first poly-time algorithm for this clustering task. Prior work either required separation scaling with the maximum cluster standard deviation (i.e. maxiσi) [DKK+22b] or required both additional structural assumptions and mean separation scaling as a large degree polynomial in 1/α [BKK22].

For general-weight mixtures, we point out that accurate clustering is information-theoretically impossible under our fine-grained mean separation assumptions. We introduce the notion of a clustering refinement — a list of not-too-small subsets satisfying a similar separation, and which can be merged into a clustering approximating the ground truth — and show that it is possible to efficiently compute an accurate clustering refinement of the samples. Furthermore, under a variant of the “no large sub-cluster” condition from in prior work [BKK22], we show that our algorithm outputs an accurate clustering, not just a refinement, even for general-weight mixtures. As a corollary, we obtain efficient clustering algorithms for mixtures of well-conditioned high-dimensional log-concave distributions.

Moreover, our algorithm is robust to Ω(α)-fraction of adversarial outliers.

[6] Agnostically Learning Multi-index Models with Queries

Ilias Diakonikolas, Daniel M. Kane, Vasilis Kontonis, Christos Tzamos, Nikos Zarifis

https://arxiv.org/abs/2312.16616

We study the power of query access for the task of agnostic learning under the Gaussian distribution. In the agnostic model, no assumptions are made on the labels and the goal is to compute a hypothesis that is competitive with the {\em best-fit} function in a known class, i.e., it achieves error opt+ϵ, where opt is the error of the best function in the class. We focus on a general family of Multi-Index Models (MIMs), which are d-variate functions that depend only on few relevant directions, i.e., have the form g(Wx) for an unknown link function g and a k×d matrix W. Multi-index models cover a wide range of commonly studied function classes, including constant-depth neural networks with ReLU activations, and intersections of halfspaces.

Our main result shows that query access gives significant runtime improvements over random examples for agnostically learning MIMs. Under standard regularity assumptions for the link function (namely, bounded variation or surface area), we give an agnostic query learner for MIMs with complexity O(k)poly(1/ϵ)poly(d). In contrast, algorithms that rely only on random examples inherently require dpoly(1/ϵ) samples and runtime, even for the basic problem of agnostically learning a single ReLU or a halfspace.

Our algorithmic result establishes a strong computational separation between the agnostic PAC and the agnostic PAC+Query models under the Gaussian distribution. Prior to our work, no such separation was known — even for the special case of agnostically learning a single halfspace, for which it was an open problem first posed by Feldman. Our results are enabled by a general dimension-reduction technique that leverages query access to estimate gradients of (a smoothed version of) the underlying label function.

[7] Manually Acquiring Targets from Multiple Viewpoints Using Video Feedback

Bailey Ramesh, Anna Konstant, Pragathi Praveena, Emmanuel Senft, Michael Gleicher, Bilge Mutlu, Michael Zinn, Robert G. Radwin

Human Factors

https://arxiv.org/abs/2204.06969

Objective: The effect of camera viewpoint was studied when performing visually obstructed psychomotor targeting tasks. Background: Previous research in laparoscopy and robotic teleoperation found that complex perceptual-motor adaptations associated with misaligned viewpoints corresponded to degraded performance in manipulation. Because optimal camera positioning is often unavailable in restricted environments, alternative viewpoints that might mitigate performance effects are not obvious. Methods: A virtual keyboard-controlled targeting task was remotely distributed to workers of Amazon Mechanical Turk. The experiment was performed by 192 subjects for a static viewpoint with independent parameters of target direction, Fitts’ law index of difficulty, viewpoint azimuthal angle (AA), and viewpoint polar angle (PA). A dynamic viewpoint experiment was also performed by 112 subjects in which the viewpoint AA changed after every trial. Results: AA and target direction had significant effects on performance for the static viewpoint experiment. Movement time and travel distance increased while AA increased until there was a discrete improvement in performance for 180°. Increasing AA from 225° to 315° linearly decreased movement time and distance. There were significant main effects of current AA and magnitude of transition for the dynamic viewpoint experiment. Orthogonal direction and no-change viewpoint transitions least affected performance. Conclusions: Viewpoint selection should aim to minimize associated rotations within the manipulation plane when performing targeting tasks whether implementing a static or dynamic viewing solution. Because PA rotations had negligible performance effects, PA adjustments may extend the space of viable viewpoints. Applications: These results can inform viewpoint-selection for visual feedback during psychomotor tasks.

[8] Identifying and Mitigating the Security Risks of Generative AI

28 Aug 2023

Clark Barrett, Brad Boyd, Elie Burzstein, Nicholas Carlini, Brad Chen, Jihye Choi, Amrita Roy Chowdhury, Mihai Christodorescu, Anupam Datta, Soheil Feizi, Kathleen Fisher, Tatsunori Hashimoto, Dan Hendrycks, Somesh Jha, Daniel Kang, Florian Kerschbaum, Eric Mitchell, John Mitchell, Zulfikar Ramzan, Khawaja Shams, Dawn Song, Ankur Taly, Diyi Yang 

Found. Trends Priv. Secur

https://paperswithcode.com/paper/identifying-and-mitigating-the-security-risks

Every major technical invention resurfaces the dual-use dilemma — the new technology has the potential to be used for good as well as for harm. Generative AI (GenAI) techniques, such as large language models (LLMs) and diffusion models, have shown remarkable capabilities (e.g., in-context learning, code-completion, and text-to-image generation and editing). However, GenAI can be used just as well by attackers to generate new attacks and increase the velocity and efficacy of existing attacks. This paper reports the findings of a workshop held at Google (co-organized by Stanford University and the University of Wisconsin-Madison) on the dual-use dilemma posed by GenAI. This paper is not meant to be comprehensive, but is rather an attempt to synthesize some of the interesting findings from the workshop. We discuss short-term and long-term goals for the community on this topic. We hope this paper provides both a launching point for a discussion on this important topic as well as interesting problems that the research community can work to address.

[9] Exploring the Capabilities of a General-Purpose Robotic Arm in Chess Gameplay.

Humanoids

https://www.youtube.com/watch?v=GsLxrXSdOqA&ab_channel=KIMLAB

[10] How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection?

How to Overcome Curse-of-Dimensionality for Out-of-Distribution Detection?

Soumya Suvra Ghosal, Yiyou Sun, Yixuan Li

https://arxiv.org/abs/2312.14452

Machine learning models deployed in the wild can be challenged by out-of-distribution (OOD) data from unknown classes. Recent advances in OOD detection rely on distance measures to distinguish samples that are relatively far away from the in-distribution (ID) data. Despite the promise, distance-based methods can suffer from the curse-of-dimensionality problem, which limits the efficacy in high-dimensional feature space. To combat this problem, we propose a novel framework, Subspace Nearest Neighbor (SNN), for OOD detection. In training, our method regularizes the model and its feature representation by leveraging the most relevant subset of dimensions (i.e. subspace). Subspace learning yields highly distinguishable distance measures between ID and OOD data. We provide comprehensive experiments and ablations to validate the efficacy of SNN. Compared to the current best distance-based method, SNN reduces the average FPR95 by 15.96% on the CIFAR-100 benchmark.

[11] The Efficacy of Transformer-based Adversarial Attacks in Security Domains

Kunyang Li, Kyle Domico, Jean-Charles Noirot Ferrand, Patrick McDaniel

IEEE Military Communications Conference (MILCOM), AI for Cyber Workshop, 2023

https://arxiv.org/abs/2310.11597

Today, the security of many domains rely on the use of Machine Learning to detect threats, identify vulnerabilities, and safeguard systems from attacks. Recently, transformer architectures have improved the state-of-the-art performance on a wide range of tasks such as malware detection and network intrusion detection. But, before abandoning current approaches to transformers, it is crucial to understand their properties and implications on cybersecurity applications. In this paper, we evaluate the robustness of transformers to adversarial samples for system defenders (i.e., resiliency to adversarial perturbations generated on different types of architectures) and their adversarial strength for system attackers (i.e., transferability of adversarial samples generated by transformers to other target models). To that effect, we first fine-tune a set of pre-trained transformer, Convolutional Neural Network (CNN), and hybrid (an ensemble of transformer and CNN) models to solve different downstream image-based tasks. Then, we use an attack algorithm to craft 19,367 adversarial examples on each model for each task. The transferability of these adversarial examples is measured by evaluating each set on other models to determine which models offer more adversarial strength, and consequently, more robustness against these attacks. We find that the adversarial examples crafted on transformers offer the highest transferability rate (i.e., 25.7% higher than the average) onto other models. Similarly, adversarial examples crafted on other models have the lowest rate of transferability (i.e., 56.7% lower than the average) onto transformers. Our work emphasizes the importance of studying transformer architectures for attacking and defending models in security domains, and suggests using them as the primary architecture in transfer attack settings.

[12] Manually Acquiring Targets from Multiple Viewpoints Using Video Feedback

Bailey Ramesh, Anna Konstant, Pragathi Praveena, Emmanuel Senft, Michael Gleicher, Bilge Mutlu, Michael Zinn, Robert G. Radwin

Human Factors

https://arxiv.org/abs/2204.06969

Objective: The effect of camera viewpoint was studied when performing visually obstructed psychomotor targeting tasks. Background: Previous research in laparoscopy and robotic teleoperation found that complex perceptual-motor adaptations associated with misaligned viewpoints corresponded to degraded performance in manipulation. Because optimal camera positioning is often unavailable in restricted environments, alternative viewpoints that might mitigate performance effects are not obvious. Methods: A virtual keyboard-controlled targeting task was remotely distributed to workers of Amazon Mechanical Turk. The experiment was performed by 192 subjects for a static viewpoint with independent parameters of target direction, Fitts’ law index of difficulty, viewpoint azimuthal angle (AA), and viewpoint polar angle (PA). A dynamic viewpoint experiment was also performed by 112 subjects in which the viewpoint AA changed after every trial. Results: AA and target direction had significant effects on performance for the static viewpoint experiment. Movement time and travel distance increased while AA increased until there was a discrete improvement in performance for 180°. Increasing AA from 225° to 315° linearly decreased movement time and distance. There were significant main effects of current AA and magnitude of transition for the dynamic viewpoint experiment. Orthogonal direction and no-change viewpoint transitions least affected performance. Conclusions: Viewpoint selection should aim to minimize associated rotations within the manipulation plane when performing targeting tasks whether implementing a static or dynamic viewing solution. Because PA rotations had negligible performance effects, PA adjustments may extend the space of viable viewpoints. Applications: These results can inform viewpoint-selection for visual feedback during psychomotor tasks.

[13] A Symmetric Multigrid-Preconditioned Krylov Subspace Solver for Stokes Equation

Yutian Tao, Eftychios Sifakis

https://arxiv.org/abs/2312.10615

Numerical solution of discrete PDEs corresponding to saddle point problems is highly relevant to physical systems such as Stokes flow. However, scaling up numerical solvers for such systems is often met with challenges in efficiency and convergence. Multigrid is an approach with excellent applicability to elliptic problems such as the Stokes equations, and can be a solution to such challenges of scalability and efficiency. The degree of success of such methods, however, is highly contingent on the design of key components of a multigrid scheme, including the hierarchy of discretizations, and the relaxation scheme used. Additionally, in many practical cases, it may be more effective to use a multigrid scheme as a preconditioner to an iterative Krylov subspace solver, as opposed to striving for maximum efficacy of the relaxation scheme in all foreseeable settings. In this paper, we propose an efficient symmetric multigrid preconditioner for the Stokes Equations on a staggered finite-difference discretization. Our contribution is focused on crafting a preconditioner that (a) is symmetric indefinite, matching the property of the Stokes system itself, (b) is appropriate for preconditioning the SQMR iterative scheme, and (c) has the requisite symmetry properties to be used in this context. In addition, our design is efficient in terms of computational cost and facilitates scaling to large domains.

[14] Characterizing Physical Memory Fragmentation

Mark Mansi, Michael M. Swift

https://arxiv.org/abs/2401.03523

External fragmentation of physical memory occurs when adjacent differently sized regions of allocated physical memory are freed at different times, causing free memory to be physically discontiguous. It can significantly degrade system performance and efficiency, such as reducing the ability to use huge pages, a critical optimization on modern large-memory system. For decades system developers have sought to avoid and mitigate fragmentation, but few prior studies quantify and characterize it in production settings.

Moreover, prior work often artificially fragments physical memory to create more realistic performance evaluations, but their fragmentation methodologies are ad hoc and unvalidated. Out of 13 papers, we found 11 different methodologies, some of which were subsequently found inadequate. The importance of addressing fragmentation necessitates a validated and principled methodology.

Our work fills these gaps in knowledge and methodology. We conduct a study of memory fragmentation in production by observing 248 machines in the Computer Sciences Department at University of Wisconsin – Madison for a week. We identify six key memory usage patterns, and find that Linux’s file cache and page reclamation systems are major contributors to fragmentation because they often obliviously break up contiguous memory. Finally, we create andúril, a tool to artificially fragment memory during experimental research evaluations. While andúril ultimately fails as a scientific tool, we discuss its design ideas, merits, and failings in hope that they may inspire future research.

December 18, 2023

Prof Jelena Diakonikolas was named an AFOSR Young Investigator for her proposal “Towards Fine‐Grained Complexity of Nonsmooth Optimization”. [1]

Note:

[1] https://www.afrl.af.mil/News/Article-Display/Article/3625080/afosr-awards-215m-to-scientists-engineers-via-young-investigator-program/

The Air Force Office of Scientific Research, or AFOSR, the basic research arm of the Air Force Research Laboratory, or AFRL, will award approximately $21.5 million in grants to 48 scientists and engineers from 36 institutions and businesses in 20 states under the fiscal year 2024 Young Investigator Program, or YIP.

December 11, 2023

Professor Ming Liu and students and colleagues had the paper “Understanding Routable PCIe Performance for Composable Infrastructures” accepted into NSDI ‘24. [1]

Professor Paul Barford and students and colleagues had the paper “An Elemental Decomposition of DNS Name-to-IP Graphs” accepted into Infocom ‘24.  [2]

The Internet Measurement Conference (IMC) will be held in Madison in summer of 2025. Prof Paul Barford will serve as General Chair. [3]

Professor Andrea Arpaci-Dusseau and Professor Remzi Arpaci-Dusseau and students had the paper “Symbiosis: The Art of Application and Kernel Cache Cooperation” accepted into FAST ‘24. [4]

Notes:

[1] Understanding Routable PCIe Performance for Composable Infrastructures

Authors: Wentao Hou, Jie Zhang, Zeke Wang, and Ming Liu

NSDI ‘24

https://www.usenix.org/conference/nsdi24

Routable PCIe has become the predominant cluster interconnect to build emerging composable infrastructures. Empowered by PCIe non-transparent bridge devices, PCIe transactions can traverse multiple switching domains, enabling a server to elastically integrate a number of remote PCIe devices as local ones. However, it is unclear how to move data or perform communication efficiently over the routable PCIe fabric without understanding its capabilities and limitations.

This paper presents the design and implementation of rPCIeBench, a software-hardware co-designed benchmarking framework to systematically characterize the routable PCIe fabric. rPCIeBench provides flexible data communication privities, exposes end-to-end PCIe transition observability, and enables reconfigurable experiment deployment. Using rPCIeBench, we first analyze the communication characteristics of a routable PCIe path, quantify its performance tax, and compare it with the local PCIe link. We then use it to dissect in-fabric traffic orchestration behaviors and draw three interesting findings: approximate max-min bandwidth partition, fast end-to-end bandwidth synchronization, and interference-free among orthogonal data paths. Finally, we encode gathered characterization insights as traffic orchestration rules and develop an edge-constrained relaxing algorithm to estimate flow transmission performance over a shared fabric. We validate its accuracy and demonstrate its potential to serve as an optimization guide to design effective flow schedulers.

[2] “An Elemental Decomposition of DNS Name-to-IP Graphs”

Alex Anderson, Aadi Mondal, Paul Barford, Mark Crovella and Joel Sommers

To Appear in IEEE Infocom, 2024.

The Domain Name System (DNS) is a critical piece of Internet infrastructure with remarkably complex properties and uses, and accordingly has been extensively studied.  In this study we contribute to that body of work by organizing and analyzing records maintained within the DNS as a bipartite graph.  We find that relating names and addresses in this way uncovers a surprisingly rich structure.  In order to characterize that structure, we introduce a new graph decomposition for DNS name-to-IP mappings, which we term elemental decomposition.  In particular, we argue that (approximately) decomposing this graph into bicliques — maximally connected components — exposes this rich structure.  We utilize large-scale censuses of the DNS to investigate the characteristics of the resulting decomposition, and illustrate how the exposed structure sheds new light on a number of questions about how the DNS is used in practice and suggests several new directions for future research.

[3] https://www.sigcomm.org/events/imc-conference

The Internet Measurement Conference is an annual conference focusing on Internet measurement and analysis, sponsored by ACM SIGCOMM and ACM SIGMETRICS. The aim is that papers presented at the conference contribute to the current understanding of how to collect or analyze Internet measurements, or give insight into how the Internet behaves.

IMC was begun as a workshop in 2001 in response to the difficulty at that time finding appropriate publication/presentation venues for high-quality Internet measurement research in general, and frustration with th  annual ACM SIGCOMM conference’s treatment of measurement submissions in particular.

The conference is generally held in late October or early November. In its first two years as a Workshop, attendance was limited, but now as a conference it is open to all interested in attending.

Papers are submitted in May with notifications in July and camera-ready papers due in August. The acceptance rate to date has been around 25%. IMC is usually open to both full-length papers and extended abstracts, with the latter being significantly shorter and intended to convey work-in progress and less mature results.

To encourage broader data sharing in the community, each year IMC will present a best paper award for the top paper that makes its data sets publically available by the time of camera ready submission. Several papers are selected for “fast-track” submission to IEEE/ACM Transactions on Networking. In addition, IMC is committed to offering a number of student travel grants to foster the participation of students.

[4] Symbiosis: The Art of Application and Kernel Cache Cooperation

Yifan Dai, Jing Liu, Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau

FAST ‘24

https://www.usenix.org/conference/fast24

Abstract. We introduce Symbiosis, a framework for key-value storage systems that dynamically configures application and kernel cache sizes to improve performance. We integrate Symbiosis into three production systems – LevelDB, WiredTiger, and RocksDB – and, through a series of experiments on various workloads and environments, show that Symbiosis improves performance by 1.5× on average and over 5× at best compared to static configurations, across a wide range of synthetic and real-world workloads.

November 27, 2023

Prof Karu Sankaralingam was named an IEEE Fellow for “contributions to identifying and mitigating the challenges of dark silicon”. [1]

Prof Karu Sankaralingam and others from UW-Madison have two papers at ASPLOS and one at HPCA. Karu’s papers include “A Journey of a 1,000 Kernels Begins with a Single Step: A Retrospective of Deep Learning on GPUs” at ASPLOS and “WASP: Exploiting Pipeline Parallelism with GPU Hardware and Compiler Support”. [2]

Prof Jin-Yi Cai and collaborators published “Counting Cycles on Planar Graphs in Subexponential Time” (Algorithmica) and “Bounded Degree Nonnegative Counting CSP” in the ACM Transactions on Computation Theory. [3]

Notes:

[1] Prof Karu Sankaralingam named an IEEE Fellow:

General info:https://www.comsoc.org/membership/ieee-fellow

This year’s class:

https://www.ieee.org/content/dam/ieee-org/ieee/web/org/about/fellows/2024-fellow-class.pdf

[2] “Carat: Unlocking Value-Level Parallelism for Multiplier-Free GEMMs”

Di Wu (UCF), Zhewen Pan, Joshua San Miguel (UW-Madison ECE)

ASPLOS 2024

In recent years, to deliver better performance and efficiency in deep neural networks, hardware architectures optimized for general matrix multiplication (GEMM) have been well studied. However, with the trend of batched, low-precision data, e.g., FP8 format in this work, we observe that there is growing untapped potential for value reuse. We propose a novel computing paradigm, value-level parallelism, whereby unique products are computed only once, and different inputs subscribe to (select) their products via temporal coding. Our architecture, Carat, employs value-level parallelism and transforms multiplication into accumulation, performing GEMMs with efficient multiplier-free hardware. Experiments show that, on average, Carat improves iso-area throughput and energy efficiency by 1.02x and 1.06x over a conventional systolic array and 3.2x and 4.3x when scaled up to multiple nodes.

“A Journey of a 1,000 Kernels Begins with a Single Step: A Retrospective of Deep Learning on GPUs”

Michael Davies, Selvaraj Anandaraj, Ian McDougall, Deep Machchhar, Rithik Jain, Karu Sankaralingam

ASPLOS 2024

We are in age of AI, with rapidly changing algorithms and a somewhat synergistic change in hardware. MLPerf is a recent benchmark suite that serves as a way to compare and evaluate hardware. However it has several drawbacks – it is dominated by CNNs and does a poor job of capturing the diversity of AI use cases, and only represents a sliver of production AI use cases. This paper performs a longitudinal study of state-of-art AI applications spanning vision, physical simulation, vision synthesis, language and speech processing, and tabular data processing, across three generations of hardware to understand how the AI revolution has panned out. The paper reports on data gathered at the framework level, device API level, and hardware and microarchitecture level. The paper provides insights on the hardware-software revolution with pointers to future trends.

“WASP: Exploiting Pipeline Parallelism with GPU Hardware and Compiler Support”

Neal Crago (NVIDIA), Sana Damani (NVIDIA), Karu Sankaralingam, Stephen W. Keckler (NVIDIA)

HPCA 2024

Graphics processing units (GPUs) are an important class of parallel processors that offer high compute throughput and memory bandwidth. GPUs are used in a variety of important computing domains, such as machine learning, high performance computing, sparse linear algebra, autonomous vehicles, and robotics. However, some applications from these domains can underperform due to sensitivity to memory latency and bandwidth. Some of this sensitivity can be reduced by better overlapping memory access with compute. Current GPUs often leverage pipeline parallelism in the form of warp specialization to enable better overlap. However, current pipeline parallelism support on GPUs is limited in three ways. First, warp specialization is a complex and manual transformation that is out of reach for many applications and developers. Second, it is limited to coarse-grained transfers between global memory and the shared memory scratchpad (SMEM); fine-grained memory access patterns are not well supported. Finally, the GPU hardware is unaware of the pipeline parallelism expressed by the programmer, and is unable to take advantage of this information to make better decisions at runtime.  In this paper we introduce WASP, hardware and compiler support for warp specialization that addresses these limitations. WASP enables fine-grained streaming and gather memory access patterns through the use of warp-level register file queues. Explicit warp to pipeline stage naming enables the GPU to be aware of pipeline parallelism, which WASP capitalizes on by designing pipeline-aware warp mapping, register allocation, and scheduling. Finally, we design and implement a compiler that can automatically generate warp specialized kernels, reducing programmer burden. Overall, we find that performance can be improved on a variety of important applications by an average of 38% over a modern GPU baseline.

[3] “Counting Cycles on Planar Graphs in Subexponential Time”

Jin-Yi Cai, Ashwin Maran

Algorithmica

https://trebuchet.public.springernature.app/get_content/5306b7e4-b038-4247-8e22-6ee5149c4b31

We study the problem of counting all cycles or self- avoiding walks (SAWs) on triangulated planar graphs. We present a subexponential 2O(√n) time algorithm for this counting problem. Among the technical ingredients used in this algorithm are the planar separator theorem and a delicate analysis using pairs of Motzkin paths and Motzkin numbers. We can then adapt this algorithm to uniformly sample SAWs, in subexponential time. Our work is motivated by the problem of gerrymandered districting maps.

“Bounded Degree Nonnegative Counting CSP”

Jin-Yi Cai, Daniel P. Szabo

ACM Transactions on Computation Theory

https://dl.acm.org/doi/10.1145/3632184

Constraint satisfaction problems (CSP) encompass an enormous variety of computational problems. In particular, all partition functions from statistical physics, such as spin systems, are special cases of counting CSP (#CSP). We prove a complete complexity classification for every counting problem in #CSP with nonnegative valued constraint functions that is valid when every variable occurs a bounded number of times in all constraints. We show that, depending on the set of constraint functions F, every problem in the complexity class #CSP(F) defined by F is either polynomial-time computable for all instances without the bounded occurrence restriction, or is #P-hard even when restricted to bounded degree input instances. The constant bound in the degree depends on F. The dichotomy criterion on F is decidable. As a second contribution, we prove a slightly modified but more streamlined decision procedure (from [14]) to test for the tractability of #CSP(F). This procedure on an input F tells us which case holds in the dichotomy for #CSP(F). This more streamlined decision procedure enables us to fully classify a family of directed weighted graph homomorphism problems. This family contains both P-time tractable problems and #P-hard problems. To our best knowledge, this is the first family of such problems explicitly classified that are not acyclic, thereby the Lovász-goodness criterion of Dyer-Goldberg-Paterson [24] cannot be applied.

 

November 13, 2023

Prof Loris D’Antoni gave the keynote at SAS ‘23 entitled “Verifying Infinitely Many Programs at Once”. [1]

Prof Bart Miller and colleagues hosted a cybersecurity ransomware simulation (in cooperation with U.S. Homeland Security, the FBI, and the National Guard) for students in his class and also got some nice press coverage for it.  [2]

Prof Sharon Li and Prof Patrick McDaniel participated in a TEDx Talk on campus entitled “Balancing AI and Technology in the Modern World”.  [3]

Notes:

[1] “Verifying Infinitely Many Programs at Once”

Loris D’Antoni

Details: Here

In traditional program verification, the goal is to automati- cally prove whether a program meets a given property. However, in some cases one might need to prove that a (potentially infinite) set of programs meets a given property. For example, to establish that no program in a set of possible programs (i.e., a search space) is a valid solution to a synthesis problem specification, e.g., a property φ, one needs to verify that all programs in the search space are incorrect, e.g., satisfy the property ¬φ. The need to verify multiple programs at once also arises in other domains such as reasoning about partially specified code (e.g., in the presence of library functions) and self-modifying code. This paper discusses our recent work in designing systems for verifying properties of infinitely many programs at once

[2] Media coverage of Prof Bart Miller’s Cyber TTX Exercise on Nov 7: NBC15 | WKOW

“UW-Madison hosted a cyber security exercise as ransomware attacks are on the rise in Wisconsin.

‘Ransomware is rampant and increasing,’ UW Madison Computer Science Professor Barton Miller said. ‘In the real world it’s not a question of if you’re going to be attacked or not. You will.’

American Family Insurance, Kwik Trip and Rock County are three organizations that dealt with ransomware attacks in October. “

Details:  NBC15 | WKOW

[3] “Balancing AI and Technology in the Modern World”

November 6th, 2023

Tripp Commons at Memorial Union in Madison, Wisconsin

6:00 PM CST to 8:00 PM CST

Details: https://www.tedxuwmadison.com/

TEDxUWMadison will be hosting its annual event this year with the theme: “Balancing AI and Technology in the Modern World.” We aim to explore how advancements in AI and technology have the potential to either perpetuate or mitigate biases in society. Additionally, we want to delve into the profound ways in which these innovations have transformed human interaction and societal dynamics.

October 30, 2023

Work done by Prof Bilge Mutlu and colleagues at U. Rochester, MIT, and elsewhere received the Ubicomp 10-Year Impact Award at the 2023 ACM Ubicomp Conference. [1]

Prof Paul Barford, Prof Patrick McDaniel, and their student had their paper “The CVE Wayback Machine: Measuring Coordinated Disclosure from Exploits Against 2 Years of Zero-Days” recognized as runner-up for the IMC best paper award. [2]

Prof Somesh Jha and colleagues recently made available the first publicly-detectable watermarking approach for LLMs in their paper “Publicly Detectable Watermarking for Language Models”. [3].

Prof Matt Sinclair and colleagues had the paper “Fifty Years of ISCA: A Data-driven Retrospective” accepted as an IEEE Micro invited article. [4]

Notes:

[1] “MACH: My Automated Conversation coach”

By Mohammed (Ehsan) Hoque, Matthieu Courgeon, Jean-Claude Martin, Bilge Mutlu, and Rosalind W. Picardhttps://www.ubicomp.org/ubicomp-iswc-2023/awards/ubicomp-iswc-2023-awards/

MACH–My Automated Conversation coacH–is a novel system that provides ubiquitous access to social skills training. The system includes a virtual agent that reads facial expressions, speech, and prosody and responds with verbal and nonverbal behaviors in real time. This paper presents an application of MACH in the context of training for job interviews. During the training, MACH asks interview questions, automatically mimics certain behavior issued by the user, and exhibit appropriate nonverbal behaviors. Following the interaction, MACH provides visual feedback on the user’s performance. The development of this application draws on data from 28 interview sessions, involving employment-seeking students and career counselors. The effectiveness of MACH was assessed through a weeklong trial with 90 MIT undergraduates. Students who interacted with MACH were rated by human experts to have improved in overall interview performance, while the ratings of students in control groups did not improve. Post-experiment interviews indicate that participants found the interview experience informative about their behaviors and expressed interest in using MACH in the future.

[2] “The CVE Wayback Machine: Measuring Coordinated Disclosure from Exploits Against 2 Years of Zero-Days”

Eric Pauley, Paul Barford and Patrick McDaniel

Proceedings of the ACM Internet Measurement Conference, 2023

https://dl.acm.org/doi/10.1145/3618257.3624810

Real-world software security depends on coordinated vulnerability disclosure (CVD) from researchers, a process that the community has continually sought to measure and improve. Yet, CVD practices are only as effective as the data that informs them, with deep and representative data collection remaining frustratingly elusive. In this paper, we leverage Kappa, a cloud-based interactive Internet telescope, to build statistical models of vulnerability lifecycles, bridging the data gap in over 20 years of CVD research. By analyzing application-layer Internet scanning traffic seen by our new vantage point over two years, we identify real-world exploitation timelines for 63 emergent threats. We bring this data together with six additional conventional datasets to build a complete birth-to-death model of these vulnerabilities, the most complete analysis of vulnerability lifecycles to date. Our statistical analysis reaches three key recommendations: (1) CVD across diverse vendors shows lower effectiveness than previously thought, (2) intrusion detection systems are underutilized to provide protection for critical vulnerabilities, and (3) existing data sources of CVD can be augmented by novel approaches to Internet measurement. In this way, our vantage point offers new opportunities to improve the CVD process, achieving a safer software ecosystem in practice.

[3] “Publicly Detectable Watermarking for Language Models”

Jaiden Fairoze, University of California, Berkeley

Sanjam Garg, University of California, Berkeley

Somesh Jha, University of Wisconsin–Madison

Saeed Mahloujifar, FAIR, Meta

Mohammad Mahmoody, University of Virginia

Mingyuan Wang, University of California, Berkeley

https://eprint.iacr.org/2023/1661

We construct the first provable watermarking scheme for language models with public detectability or verifiability: we use a private key for watermarking and a public key for watermark detection. Our protocol is the first watermarking scheme that does not embed a statistical signal in generated text. Rather, we directly embed a publicly-verifiable cryptographic signature using a form of rejection sampling. We show that our construction meets strong formal security guarantees and preserves many desirable properties found in schemes in the private-key watermarking setting. In particular, our watermarking scheme retains distortion-freeness and model agnosticity. We implement our scheme and make empirical measurements over open models in the 7B parameter range. Our experiments suggest that our watermarking scheme meets our formal claims while preserving text quality.

[4] “Fifty Years of ISCA: A Data-driven Retrospective”

Matthew D. Sinclair, Parthasarathy Ranganathan, Gaurang Upasani, Adrian Sampson, David Patterson, Rutwik Jain, Nidhi Parthasarathy, Shaan Shah

https://arxiv.org/abs/2306.03964

2023 marked the fiftieth year of the International Symposium on Computer Architecture (ISCA). As one of the oldest and preeminent computer architecture conferences, ISCA represents a microcosm of the broader community;  correspondingly, a 50-year-retrospective offers us a great way to track the impact and evolution of the field. Analyzing the content and impact of all the papers published at ISCA so far, we show how computer architecture research has been at the forefront of advances that have driven the broader computing ecosystem. Decadal trends show a dynamic and rapidly-evolving field, with diverse contributions. Examining how the most highly-cited papers achieve their popularity reveals interesting trends on technology adoption curves and the path to impact. Our data also highlights a growing and thriving community, with interesting insights on diversity and scale. We conclude with a summary of the celebratory panel held at ISCA, with observations on the exciting future ahead.

October 9, 2023

Prof Bilge Mutlu and students received support from the Google Award for Inclusion Research (AIR) Program for their work titled “Supporting Social Participation for Older Adults through Robotic Telepresence”. [1]

The CDIS Job Fair drew nearly 2000 students and nearly 50 employers (in late September when it took place). Thanks to Prof Bart Miller and the many others who helped create such a wonderful event for our students! [2]

The 2023 Rosser lecture was given by Prof Zvi Galil, Professor and former Dean at Georgia Tech. Thank you to Prof Jin-Yi Cai for hosting! [3]

Prof Matt Sinclair gave an invited talk entitled “Leveraging open source simulators to enable HW/SW co-design of next-generation HPC systems” at Exascale Simulation of Next-Generation Computing Architectures.

Notes:

[1] Prof. Mutlu and students received support from the Google Award for Inclusion Research (AIR) Program for their work titled “Supporting Social Participation for Older Adults through Robotic Telepresence.”

More information on the program: The Award for Inclusion Research Program recognizes and supports academic research in computing and technology that addresses the needs of historically marginalized groups globally. Launched in 2020, the Award for Inclusion Research (AIR) Program is an ongoing effort to support innovative research and professors working to create positive societal impact.

Link: https://research.google/outreach/air-program/

[2] https://cdis.wisc.edu/cdis-job-fair-draws-thousands/

Over 1,800 students stopped by the School of Computer, Data, and Information Sciences job fair held on September 26 to network with local and national businesses looking for talent for internships and full-time career opportunities. Students networked with 48 employers including DraftKings, Google, Epic, Fetch, Raven, and more.

Read more at the link above.

[3] https://www.cs.wisc.edu/2023/09/05/rosser-lecture-zvi-galil/

The Rosser Lecture series is back! The 2023 lecture will feature Dr. Zvi Galil, Professor and former Dean of Computer Science at Georgia Institute of Technology, speaking about ”Georgia Tech’s online MOOC-based Master’s Program and the future of higher education.” Dr. Galil will discuss the success of the online master’s in computer science at Georgia Tech (OMSCS), launched in 2014.

Read more at the link above.

[4] “Leveraging open source simulators to enable HW/SW co-design of next-generation HPC systems”

Matt Sinclair

Exascale Simulation of Next-Generation Computing Architectures

Abstract: In this talk, I discuss our approach to address these challenges and modernize simulation approaches.  More specifically, I will discuss our efforts to develop new gem5-SST based simulation & modeling techniques to enable hardware-software co-design efforts that can model and prototype future HPC-scale systems.  Since no single methodology can provide the required accuracy, detail, or scale, we believe the path forward is through “multi-fidelity simulation.” Multi-fidelity simulation combines detailed cycle-level simulators with scalable simulation to create a multi-fidelity system which scales to model future supercomputers without compromising accuracy.  Moreover, multi-fidelity simulation in time and space will provide accurate exascale performance predictions that enable design space trade-offs exploration. This unified simulation framework will integrate best in class models—the fastest with good high-level behavioral accuracy and the most detailed with cycle-level accuracy—in the same simulation instance in both time and space. This talk also discusses the current status of multi-fidelity simulation and where improvements are needed to reach these lofty goals.

October 2, 2023

  • Prof Miron Livny and colleagues ran the HTCondor European Workshop in Orsay, France. [1]
  • Prof Somesh Jha and colleagues are running a GenAI Risks workshop this October.  [2]
  • Prof Yudong Chen, Ilias Diakonikolas, Prof Jelena Diakonikolas, Prof Josiah Hanna, Prof Somesh Jha, Prof Kirthi Kandasamy, Prof Yong Jae Lee, Prof Sharon Li, Prof Yingyu Liang, Prof Christos Tzamos, Prof Jerry Zhu, our Chair Prof Steve Wright, and many colleagues published 30 papers at NeurIPS this year, including a number of Spotlight papers.  [3]

Notes:

[1] The week of September 17th, the Irene-Joliot Curie Lab, in Orsay, France hosted the annual HTCondor European Workshop.  41 attendees representing 20 institutions from 9 countries, including members of the CS Department’s Center for High Throughput Computing, met to discuss progress, challenges and future goals for deploying and operating distributed High Throughput systems using the HTCondor Software Suite. Highlights include benchmark results from the CMS experiment at CERN, showing that, if needed, they can scale their computing cluster to support over 800,000 concurrent running jobs, the European Weather Cloud’s adoption of HTCondor for their computing, and news about Pelican, a new distributed storage system intended to support distributed high throughput workloads.

[2] Here is the website for the GenAI Risks workshop this October 16 in Reston, VA:

https://sites.google.com/view/genai-risks-workshop-oct-2023

The agenda is shaping up around GenAI policy, alignment, and watermarking and detection, and will be online asap.

We’d like your help in broadening the attendance a bit. If you have students, colleagues, and collaborators interested in participating in the discussion, please forward this message with the website to them or simply point them to the RSVP page on the website:

https://docs.google.com/forms/d/e/1FAIpQLSe49qQFGH9Vs2H7uttT5n6VXjazRJyC5JxwTio63XpHcv5D-Q/viewform

We look forward to seeing you there.

Thank you,

Mihai Christodorescu (Google)

Somesh Jha (UW-Madison and Google)

Khawaja Shams (Google)

John Mitchell (Stanford)

[3] Here is the list gathered:

1 – “First Order Stochastic Optimization with Oblivious Noise”

I. Diakonikolas, S. Karmalkar, J. Park, C. Tzamos

This paper gives the first efficient method for stochastic first order optimization with noise, broadly generalizing the heavy-tailed setting.

2 – “SQ Lower Bounds for Non-Gaussian Component Analysis with Weaker Assumptions”

I. Diakonikolas, D. Kane, L. Ren, Y. Sun

This paper greatly strengthens a general methodology (developed by the first two named authors and Alistair Stewart in FOCS 2017) to establish information-computation tradeoffs for high-dimensional statistical tasks. As a corollary, it gives new such tradeoffs for a range of ML problems.

3 – “SQ Lower Bounds for Learning Mixtures of Linear Classifiers”

I. Diakonikolas, D. Kane, Y. Sun

This paper proves an information-computation tradeoff showing that known algorithms for learning mixtures of linear classifiers are essentially best possible.

4 – “Near-Optimal Algorithms for Gaussians with Huber Contamination: Mean Estimation and Linear Regression”

I. Diakonikolas, D. Kane, A. Pensia, T. Pittas

This paper revisits the classical problems of mean estimation and linear regression of a multivariate Gaussian in Huber’s contamination model. It gives optimal algorithms for these basic tasks with respect to sample size, runtime, and accuracy guarantees.

5 – “Efficient Testable Learning of Halfspaces with Adversarial Label Noise”

I. Diakonikolas, D. Kane, V. Kontonis, S. Liu, N. Zarifis

This paper gives the first efficient algorithm for learning linear separation with adversarial labe noise in the testable framework, in which we can trust the accuracy of our hypothesis independent of the assumptions.

6 – “A Spectral Algorithm for List-Decodable Covariance Estimation in Relative Frobenius Norm” [Spotlight]

I. Diakonikolas, D. Kane, J. Lee, A. Pensia, T. Pittas

This paper gives a fast and provable algorithm for list-decodable covariance estimation in the presence of a majority of outliers. It implies the first efficient algorithm for robustly learning arbitrary mixtures of Gaussians that does not require solving a large convex program.

7 – “Grounding Neural Inference with Satisfiability Modulo Theories” [Spotlight]

Matt Fredrikson, Kaiji Lu, Somesh Jha, Saranya Vijayakumar, Vijay Ganesh, Zifan Wang

This paper enhances grounding in ML models by incorporating satisfiability modulo solvers (SMT), such as Z3.

8 – “Doubly Robust Peer-To-Peer Learning Protocol”

Nicholas Franzese, Adam Dziedzic, Christopher A. Choquette-Choo, Mark R. Thomas, Muhammad Ahmad Kaleem, Stephan Rabanser, Congyu Fang, Somesh Jha, Nicolas Papernot, Xiao Wang

This paper uses techniques from secure multiparty computation (SMC) to make peer-to-peer learning (such as various federated learning algorithms) more robust to malicious participants.

9 – “Block-Coordinate Methods and Restarting for Solving Extensive-Form Games”

D. Chakrabarti, J. Diakonikolas, C. Kroer,

The paper provides a cyclic block coordinate method for solving extensive-form games. The method is the first block coordinate-type method for such problems and the results are surprising from at least two aspects: (1) that it is possible at all to get convergence results for a block coordinate-type method, as the feasible set is not block separable, and (2) it is the first cyclic method that provably does not converge slower than full vector-update methods in the worst case (all other cyclic methods are slower by a factor scaling polynomially with the number of blocks in the worst case). The paper also provides a restarting strategy that empirically speeds up the algorithm.

10 – “Near-Optimal Bounds for Learning Gaussian Halfspaces with Random Classification Noise”

I. Diakonikolas, J. Diakonikolas, D.M. Kane, P. Wang, N. Zarifis

This paper provides matching upper and lower bounds for “efficient” algorithms addressing one of the most fundamental learning problems: linear classification where the data points follow standard Gaussian distribution and the labels are determined by a biased halfspace + random classification noise. The paper provides a statistical query (SQ) lower bound, which applies to all standard approaches for addressing this problem, that shows that there is an inherent information-computation gap for this problem: all polynomial-time SQ algorithms in the worst case need more samples than suggested by the information-theoretic lower bound. A surprising aspect of the upper bound is that we obtain an optimization algorithm (a variant of Riemannian gradient descent) that obtains linear convergence (with geometrically reducing error) for a nonconvex piecewise-linear optimization problem on a sphere.

11 – “Robust Second-Order Nonconvex Optimization and Its Application to Low Rank Matrix Sensing”

S.Li, Y. Cheng, I. Diakonikolas, J. Diakonikolas, R. Ge, S. Wright,

This paper provides the first algorithm for finding second-order stationary points in  nonconvex optimization in the outlier-robust setting, and with dimension-independent error guarantees. It further demonstrates how these general results can be applied to low rank matrix sensing problems, where outliers are likely to arise and second order stationarity suffices as a criterion of convergence.

12 – “Geometry-Aware Adaptation for Pretrained Models”

Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang, Jitian Zhao, Frederic Sala

13 – “Mitigating Source Bias for Fairer Weak Supervision”

Changho Shin, Sonia Cromp, Dyah Adila, Frederic Sala

14 – “Promises and Pitfalls of Threshold-based Auto-labeling” [Spotlight]

Harit Vishwakarma, Heguang Lin, Frederic Sala, Ramya Vinayak

15 – “Train ‘n Trade: Foundations of Parameter Markets”

Tzu-Heng Huang, Harit Vishwakarma, Frederic Sala

16 – “Skill-it! A Data-driven Skills Framework for Understanding and Training Language Models” [Spotlight]

ayee F Chen (Stanford), Nicholas Roberts, Kush Bhatia (Stanford), Jue WANG (Together AI), Ce Zhang (Together AI), Frederic Sala, Christopher Re (Stanford)

17 – “Embroid: Unsupervised Prediction Smoothing Can Improve Few-Shot Classification”

Neel Guha (Stanford), Mayee F Chen (Stanford), Kush Bhatia (Stanford), Azalia Mirhoseini (Anthropic AI), Frederic Sala, Christopher Re (Stanford)

18 – “Visual Instruction Tuning” [Oral]

Haotian Liu*, Chunyuan Li*, Qingyang Wu, and Yong Jae Lee

(*equal contribution)

This paper introduces LLaVA (Large Language and Vision Assistant), a large multimodal model that connects a vision encoder and LLM for general-purpose visual and language understanding.

https://llava-vl.github.io/

LLaVA is able to explain that the artwork is a parody of the Mona Lisa.

19 – “Dissecting Knowledge Distillation: An Exploration of its Inner Workings and Applications”

Utkarsh Ojha*, Yuheng Li*, Anirudh Sundara Rajan*, Yingyu Liang, and Yong Jae Lee

(*equal contribution)

This paper presents a comprehensive study to try to answer the following question: what is the knowledge that gets distilled in knowledge distillation? That is, in what ways does the student become similar to the teacher? Does it start to localize objects in the same way? Does it get fooled by the same adversarial samples? Does its data invariance properties become similar?

20 – “Visual Instruction Inversion: Image Editing via Image Prompting”

Thao Nguyen, Yuheng Li, Utkarsh Ojha, and Yong Jae Lee

This paper presents a framework for inverting visual prompts (i.e., before-and-after reference images) into editing instructions for controllable text-to-image generation.

https://thaoshibe.github.io/visii/

21 – “Segment Everything Everywhere All at Once”

Xueyan Zou, Jianwei Yang, Hao Zhang, Feng Li, Linjie Li, Jianfeng Wang, Lijuan Wang, Jianfeng Gao, and Yong Jae Lee

This paper presents an interactive model for segmenting images.  The model can be prompted with visual prompts (points, marks, boxes, scribbles and image segments) and language prompts (text and audio).

https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once

22 – “Conditional Mutual Information for Disentangled Representations in Reinforcement Learning” [spotlight]

Mhairi Dunion, Trevor McInroe, Kevin Luck, Josiah Hanna, and Stefano Albrecht

23 – “State-Action Similarity-Based Representations for Off-Policy Evaluation”

Brahma Pavse, Josiah Hanna

24 – “Multi-task Representation Learning for Pure Exploration in Bilinear Bandits”

Subhojyoti Mukherjee, Qiaomin Xie, Josiah Hanna, Robert Nowak

25 – “Mechanism Design for Collaborative Normal Mean Estimation” [spotlight]

Yiding Chen, Xiaojin Zhu, Kirthevasan Kandasamy

26 – “A Graph-Theoretic Framework for Understanding Open-World Representation Learning” [spotlight]

Yiyou Sun, Zhenmei Shi, Yixuan (Sharon) Li

Open-world representation learning aims at inferring both known and novel classes in unlabeled data, by harnessing prior knowledge from a labeled set with known classes.  Despite its importance, there is a lack of theoretical foundations for this problem. This paper bridges the gap by formalizing a graph-theoretic framework tailored for the open-world setting, where the clustering can be theoretically characterized by graph factorization. Based on our graph formulation, we introduce a new algorithm called Spectral Open-world Representation Learning (SORL), and show that minimizing our loss is equivalent to performing spectral decomposition on the graph. Such equivalence allows us to derive a provable error bound on the clustering performance for both known and novel classes, and analyze rigorously when labeled data helps.

27 – “Dream the Impossible: Outlier Imagination with Diffusion Models”

Xuefeng Du, Yiyou Sun, Xiaojin (Jerry) Zhu, Yixuan (Sharon) Li

Generating photo-realistic outliers in the high dimensional pixel space has been an open challenge for the field.  To tackle the problem, this paper proposes a new framework DREAM-OOD, which enables imagining photo-realistic outliers by way of diffusion models, provided with only the in-distribution (ID) data and classes.

28 – “Learning to Augment Distributions for Out-of-distribution Detection”

Qizhou Wang, Zhen Fang, Yonggang Zhang, Feng Liu, Yixuan (Sharon) Li, Bo Han

This paper derives learning theory analyzing the distribution discrepancy between the auxiliary and the unseen real OOD data, and how it affects the open-world OOD detection performance.

29 – “Provable Guarantees for Neural Networks via Gradient Feature Learning”

Zhenmei Shi, Jenny Wei, Yingyu Liang

This paper proposes a unified analysis framework for two-layer networks trained by gradient descent, including several existing analyses as special cases. It also sheds light on several interesting network learning phenomena.

30 – “Restless Bandits with Average Reward: Breaking the Uniform Global Attractor Assumption” [spotlight]

Yige Hong, Qiaomin Xie, Yudong Chen, and Weina Wang

August 28, 2023

  • Prof Mohit Gupta and many different colleagues had six(!) papers accepted into IEEE ICCV (the International Conference on Computer Vision). [1]
  • An L&S Magazine article, entitled “I Spy”, about Prof Rahul Chatterjee’s work on helping victims of domestic abuse is available here. [2]
  • Prof Somesh Jha and colleagues (including CS affiliate Prof Kassem Fawaz from ECE, in one instance) had two papers accepted into ACM CCS ‘23, entitled “Zero-Knowledge Proofs of Training” and “Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks”. [3]
  • Prof Efty Sifakis had the following work (on face simulation for the movie Avatar 2), entitled “Simulation-aided face strain extraction for ML animation systems”, presented at ACM SIGGRAPH.  [4]

[1] All six to appear at IEEE ICCV

[a] SoDaCam: Software-defined Cameras via Single-Photon Imaging (*oral presentation) [Project Page]

Varun Sundar (UW), Andrei Ardelean (EPFL), Tristan Swedish (Ubicept), Claudio Brusschini (EPFL), Edoardo Charbon (EPFL), Mohit Gupta (UW)

[b] Eventful Transformers: Leveraging Temporal Redundancy in Vision Transformers [Project Page]

Matthew Dutson (UW), Yin Li (UW), Mohit Gupta (UW)

[c] Computational 3D Imaging with Position Sensors (*oral presentation) [Project Page]

Jeremy Klotz (Columbia), Mohit Gupta (UW), Aswin Sankaranarayanan (CMU)

[d] Learned Compressive Representations for Single-Photon 3D Imaging [Project Page]

Felipe Gutierrez-Barragan (UW), Fangzhou Mu (UW), Andrei Ardelean (EPFL), Atul Ingle (Portland), Claudio Bruschini (EPFL), Edoardo Charbon (EPFL), Yin Li (UW), Mohit Gupta (UW), Andreas Velten (UW)

[e] Eulerian Single-Photon Vision [Project Page]

Shantanu Gupta (UW), Mohit Gupta (UW)

[f] Panoramas from Photons [Project Page]

Sacha Jungerman (UW), Atul Ingle (Portland), Mohit Gupta (UW)

[2] “I Spy”

by Aaron Conklin

“Rahul Chatterjee and his team have created a clinic to help victims of domestic abuse whose attackers may be using technology to stalk them.”

Read the rest here.

[3] To appear in ACM CCS 2023

https://www.sigsac.org/ccs/CCS2023/

“Zero-Knowledge Proofs of Training”

Sanjam Garg (University of California, Berkeley and NTT Research)

Aarushi Goel (NTT Research)

Somesh Jha (University of Wisconsin)

Saeed Mahloujifar (Meta)

Mohammad Mahmoody (University of Virginia)

Guru-Vamsi Policharla (UC Berkeley)

Mingyuan Wang (UC Berkeley)

How can a model owner prove that they trained their model according to the correct

specification? More importantly, how can they do so while preserving the privacy of the underlying dataset and the final model? We study this problem and formulate the notion of zero-knowledge proof of training (zkPoT), which formalizes rigorous security guarantees that should be achieved by a privacy-preserving proof of training. We then design novel zkPoT protocol, that is provably secure, based on MPC-in-the-head techniques combined with ideas from the zkSNARKs literature to achieve the best of both worlds in terms of communication efficiency and computational resources needed. Importantly, our protocol is streaming friendly and does not require RAM proportional to the size of the circuit being trained and hence can be adapted to the requirements of available hardware. We implement and evaluate our protocol for training a logistic regression model and estimate that for a dataset of 1 million records with 1024 features, the online prover time is about one

hour.

“Stateful Defenses for Machine Learning Models Are Not Yet Secure Against Black-box Attacks”

Ryan Feng (University of Michigan)

Ashish Hooda (University of Wisconsin-Madison)

Neal Mangaokar (University of Michigan)

Kassem Fawaz (University of Wisconsin-Madison)

Somesh Jha (University of Wisconsin-Madison)

Atul Prakash (University of Michigan)

Recent work has proposed stateful defense models (SDMs) as a compelling strategy to defend against a black-box attacker who only has query access to the model, as is common for online machine learning platforms. Such stateful defenses aim to defend against black-box attacks by tracking the query history and detecting and rejecting queries that are “similar” and thus preventing black-box attacks from finding useful gradients and making progress towards finding adversarial attacks within a reasonable query budget. Recent SDMs (e.g., Blacklight and PIHA) have shown remarkable success in defending against state-of-the-art black-box attacks. In this paper, we show that SDMs are highly vulnerable to a new class of adaptive black-box attacks. We propose a novel adaptive black-box attack strategy called Oracle-Guided Adaptive Rejection Sampling (OGARS) that involves two stages: (1) use initial query patterns to infer key properties about an SDM’s defense; and, (2) leverage those extracted properties to design subsequent query patterns to evade the SDM’s defense while making progress towards finding adversarial inputs. OGARS is broadly applicable as an enhancement to existing black-box attacks — we show how to apply the strategy to enhance six common black-box attacks to be more effective against current class of SDMs. For example, OGARS-enhanced versions of black-box attacks improved attack success rate against recent stateful defenses from almost 0% to to almost 100% for multiple datasets within reasonable query budgets.

[4]  “Simulation-aided face strain extraction for ML animation systems”

G. Klar, S. Ward, A. Moffat, E. Sifakis and K. Museth

ACM SIGGRAPH Talks, pp. 63:1-2

https://doi.org/10.1145/3587421.3595454

Abstract: We present a volumetric, simulation-based pipeline for the automatic creation of strain-based descriptors from facial performance capture provided as surface meshes. Strain descriptors encode facial poses via length elongation/contraction ratios of curves embedded in the flesh volume. Strains are anatomically motivated, correlate strongly to muscle action, and offer excellent coverage of the pose space. Our proposed framework extracts such descriptors from surface-only performance capture data, by extrapolating this deformation into the flesh volume in a physics-based fashion that respects collisions and filters non-physical capture defects. The result of our system feeds into Machine Learning facial animation tools, as employed in Avatar: The Way of Water.

August 21, 2023

  • Cool article entitled “Complexity Theory’s 50-Year Journey to the Limits of Knowledge” in Quanta Magazine here; check out the article and the timeline below which highlights Prof Jin-Yi Cai’s contributions.  [1]
  • Prof Mohit Gupta and colleagues had the paper “Mitigating AC and DC Interference in Multi-ToF-Camera Environments” accepted at IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI).  [2]
  • The Fall Semester is about to begin, and yes, the faculty contract has already done so; this means you are being paid! These events are always worth celebrating, but particularly so when one is away on sabbatical. [3]

————-

[1] Title: Complexity Theory’s 50-Year Journey to the Limits of Knowledge

by Ben Brubaker

“How hard is it to prove that problems are hard to solve? Meta-complexity theorists have been asking questions like this for decades. A string of recent results has started to deliver answers.”

Read the rest:

https://www.quantamagazine.org/complexity-theorys-50-year-journey-to-the-limits-of-knowledge-20230817/

Cool timeline from the article:

Also, thanks Prof Aws Albarghouthi for the pointer!

[2] “Mitigating AC and DC Interference in Multi-ToF-Camera Environments”

Authors:  Jongho Lee and Mohit Gupta

Webpage: https://wisionlab.com/project/stochastictdma/

Abstract: Multi-camera interference (MCI) is an important challenge faced by continuous-wave time-of-flight (C-ToF) cameras. In the presence of other cameras, a C-ToF camera may receive light from other cameras’ sources, resulting in potentially large depth errors. We propose stochastic exposure coding (SEC), a novel approach to mitigate MCI. In SEC, the camera integration time is divided into multiple time slots. Each camera is turned on during a slot with an optimal probability to avoid interference while maintaining a high signal-to-noise ratio (SNR). The proposed approach has the following benefits. First, SEC can filter out both the AC and DC components of interfering signals effectively, which simultaneously achieves high SNR and mitigates depth errors. Second, time-slotting in SEC enables 3D imaging without saturation in the high photon flux regime. Third, the energy savings due to camera turning on during only a fraction of integration time can be utilized to amplify the source peak power, which increases the robustness of SEC to ambient light. Lastly, SEC can be implemented without modifying the C-ToF camera’s coding functions, and thus, can be used with a wide range of cameras with minimal changes. We demonstrate the performance benefits of SEC with thorough theoretical analysis, simulations and real experiments, across a wide range of imaging scenarios.

[3] The Academic Calendar is good to check out:

https://secfac.wisc.edu/academic-calendar/

Especially for new folks!

August 7, 2023

  • Prof Matt Sinclair and student and colleagues had a paper accepted at IISWC entitled “Tale of Two Cs: Computation vs. Communication Scaling for Future Transformers on Future Hardware”.  [1]
  • Prof Somesh Jha’s former student, Matt Fredrikson, was part of a recent effort that showed how GPT guardrails can be circumvented, with a NY Times article about it.  [2]
  • Prof Ming Liu and collaborators had a three-year proposal entitled “Fast and Accurate Modeling and Performance Analysis of CXL-Centric Data Center Networks” by Intel.  [3]
  • Prof Matt Sinclair and collaborators had the proposals entitled “DOE Express: Leveraging Open Source Simulators to Enable HW/SW Co-Design of Next-Generation HPC Systems” funded.  [4]
  • Prof Matt Sinclair and Prof Shivaram Venkataraman had the proposal “Collaborative Research: CSR: Medium: Fortuna: Characterizing and Harnessing Performance Variability in Accelerator-rich Clusters” funded.  [5]
  • Prof Andrea Arpaci-Dusseau gave a talk about her advisor, David Culler, at his Festschrift at UC Berkeley, and also chaired the session about his first era at Berkeley on parallel and distributed computing.  [6]

—–

[1]  Paper Name: Tale of Two Cs: Computation vs. Communication Scaling for Future Transformers on Future Hardware

Authors: Suchita Pati, Shaizeen Aga, Mahzabeen Islam, Nuwan Jayasena, Matthew D. Sinclair

Abstract: Scaling neural network models has delivered dramatic quality gains across ML problems. However, this scaling has increased the reliance on efficient distributed training techniques. Accordingly, as with other distributed computing scenarios, it is important to understand how will compute and communication scale relative to one another as models scale and hardware evolves? A careful study which answers this question can better guide the design of future systems which can efficiently train future large models.

Accordingly, this work provides a comprehensive multi-axial (algorithmic, empirical, hardware evolution) analysis of compute vs. communication (Comp-vs.-Comm) scaling for future Transformer models on future hardware. First, our algorithmic analysis shows that compute generally enjoys an edge over communication as models scale. However, since memory capacity scales slower than compute, these trends are being stressed. Next, we quantify this edge by empirically studying how Comp-vs.-Comm scales for future models on future hardware. To avoid profiling numerous Transformer models across many setups, we extract execution regions and project costs using operator models. This allows a spectrum (hundreds) of future model/hardware scenarios to be accurately studied (< 15% error) and reduces profiling costs by 2100×. Our experiments show that communication will be a significant portion (40-75%) of runtime as models and hardware evolve. Moreover, communication which is hidden by overlapped computation in today’s models often cannot be hidden in future, larger models. Overall, this work highlights the increasingly large role communication will play as models scale and we conclude with a discussion of promising techniques that stand to tackle communication and also discuss how our analysis influences their betterment.

[2] https://www.nytimes.com/2023/07/27/business/ai-chatgpt-safety-research.html

From the article: “A new report indicates that the guardrails for widely used chatbots can be thwarted, leading to an increasingly unpredictable environment for the technology.”

Underlying paper found here: https://llm-attacks.org/

[3]  Prof Ming Liu, Prof. Umit Y. Orgas (ECE), Qiaomin Xie (Industrial and System Engineering)

“Fast and Accurate Modeling and Performance Analysis of CXL-Centric Data Center Networks”

Intel, 3-year proposal

Summary: Modern data center networks consist of thousands of end nodes, such as CPUs, GPUs, and storage, connected by multi-layer networks and protocols, such as Compute Express Link (CXL). Applications running on the end nodes impose large throughput (~100 Gbps) and low latency (<10 ms) requirements on the data center networks (DCN). To satisfy these requirements, the DCN designers employ more complex network switches and topologies. At the same time, the network protocols and algorithms running on these devices, such as congestion control, grow in complexity. As a result, fast and accurate pre-silicon exploration and optimization of DCN at server, rack, and DCN scales becomes both more important and challenging. Widely accepted simulation frameworks, such as ns3, enable accurate performance evaluations. However, the accuracy comes at a steep increase in exploration time, which can severely limit the scope and fidelity of pre-silicon evaluations. Therefore, we propose fast and accurate performance analysis and optimization exploration at the server, rack, and DCN levels. We plan to achieve this by (1) performing a systematic characterization over real-world systems and applications using Intel’s Sapphire Rapids system at server-/rack-scale; (2) developing pragmatic modeling abstractions that facilitate modeling accuracy and performance (encompassing a performance model for Flex Bus and CXL layered protocol, an abstract communication substrate, and an abstract routing architecture); (3) conducting large-scale simulations by integrating the modeling techniques into existing simulation frameworks.

[4] “DOE Express: Leveraging Open Source Simulators to Enable HW/SW Co-Design of Next-Generation HPC Systems”

PIs: Oscar Hernandez (ORNL), Matt Sinclair (UW)

Co-PIs: William Godoy (ORNL), Jason Lowe-Power (UC, Davis), Bobby Bruce (UC, Davis)

Link: https://science.osti.gov/ascr/-/media/funding/pdf/Awards-Lists/2950-ASCR-EXPRESS-Awards-List.pdf

Money: $1.3M Total, $400K for UW

Shortened, Good News Friendly Abstract: This multi-institutional project aims at developing new simulation and modeling techniques based on gem5-SST to enable hardware-software co-design efforts to achieve 100 ExaFlops using 20 MW of power. Currently, state-of-the-art architecture simulators are too high level to accurately model important details, too inflexible, or too slow to provide accurate and detailed cross-stack software and hardware (SW/HW) optimizations. This project proposes a “multi-fidelity simulation” in time and space to provide accurate exascale performance predictions in time scales that enable design space trade-offs exploration. The project also aims to improve simulation efficiency by identifying key regions of interest in applications through the use of hints via parallel programming models.

[5] Title: Collaborative Research: CSR: Medium: Fortuna: Characterizing and Harnessing Performance Variability in Accelerator-rich Clusters

PIs: Shivaram Venkataraman (UW), Zhao Zhang (TACC/Rutgers)

Co-PI: Matt Sinclair (UW)

Money: $1M total, $670K for UW

Abstract: Large computing clusters, including data centers and supercomputers, are used for a variety of applications including scientific computations and machine learning. Modern compute clusters typically use specialized accelerator hardware to speed up computations. Operators of accelerator-rich clusters aim to have high resource utilization across all users of the cluster. However, these systems are often under-utilized due to performance variability across accelerators; that is, application performance can vary across accelerators even when the same application is run on the same type of accelerator. This proposal will develop Fortuna, a set of tools that can be used by cluster operators and researchers to characterize and harness variability across accelerators. First, Fortuna will use new methodologies to characterize how much performance variability exists across a wide range of accelerator hardware. Second, Fortuna will identify which applications are more likely to suffer from performance variability. Finally, Fortuna will include new scheduling mechanisms that can use variability measurements and knowledge about applications to improve utilization.

Broader impacts of the proposed research include open-source implementations of algorithms and tools, which will be applicable to many large-scale clusters and lay the groundwork for wider industry adoption. The project will also create course modules on system design principles with heterogeneous hardware and software, based on the tools developed as a part of the proposal. This will teach the next generation of students how to design hardware and software to improve utilization of future systems.

[6] https://cullerfest.cs.berkeley.edu/

In academia, a Festschrift is a book honoring a respected person, especially an academic, and presented during their lifetime. It generally takes the form of an edited volume, containing contributions from the honoree’s colleagues, former pupils, and friends. Festschriften are often titled something like Essays in Honour of… or Essays Presented to… . [Wikipedia]

This Festschrift was a little less formal, consisting of a large number of talks from former students and collaborators of David’s.

July 31, 2023

  • Prof Ming Liu and Prof Mike Swift and collaborators published “LogNIC: A High-Level Performance Model for SmartNICs” at IEEE Micro.  [1]
  • Prof Paul Barford and collaborators had two NSF proposals funded, entitled “Large Scale Analysis of Configurations and Management Practices in the Domain Name System” and “Methods for Active Measurement of the Domain Name System” with Barford as PI.  [2]
  • Prof Fred Sala and Prof Jerry Zhu created a short talk about what ChatGPT is, made for a general audience; it was distributed via “Sift and Winnow” (L&S’s monthly mailing).  [3]

———–

[1] Title: LogNIC: A High-Level Performance Model for SmartNICs

Authors: Zerui Guo, Jiaxin Lin, Yuebin Bai, Daehyeok Kim, Mike Swift, Aditya Akella, Ming Liu

Abstract: SmartNICs have become an indispensable communication fabric and computing substrate in today’s data centers and enterprise clusters, providing in-network computing capabilities for traversed packets and benefiting a range of applications across the system stack. Building an efficient SmartNIC-assisted solution is generally non-trivial and tedious as it requires programmers to understand the SmartNIC architecture, refactor application logic to match the device’s capabilities and limitations, and correlate an application execution with traffic characteristics. A high-level SmartNIC performance model can decouple the underlying SmartNIC hardware device from its offloaded software implementations and execution contexts, thereby drastically simplifying and facilitating the development process. However, prior architectural models can hardly be applied due to their ineptness in dissecting the SmartNIC-offloaded program’s complexity, capturing the nondeterministic overlapping between computation and I/O, and perceiving diverse traffic profiles.

This paper presents the LogNIC model that systematically analyzes the performance characteristics of a SmartNIC-offloaded program. Unlike conventional execution flow-based modeling, LogNIC employs a packet-centric approach that examines SmartNIC execution based on how packets traverse heterogeneous computing domains, on-/off-chip interconnects, and memory subsystems. It abstracts away the low-level device details, represents a deployed program as an execution graph, retains a handful of configurable parameters, and generates latency/throughput estimation for a given traffic profile. It further exposes a couple of extensions to handle multi-tenancy, traffic interleaving, and accelerator peculiarity. We demonstrate the LogNIC model’s capabilities using both commodity SmartNICs and an academic prototype under five application scenarios. Our evaluations show that LogNIC can estimate performance bounds, explore software optimization strategies, and provide guidelines for new hardware designs.

[2] Title:  Collaborative Research: NeTS: Medium: Large Scale Analysis of Configurations and Management Practices in the Domain Name System

Investigators:  P. Barford (PI), M. Crovella (co-PI) and J. Sommers (co-PI)

Abstract:

This project will build a broad and fundamental body of knowledge on the characteristics of configurations and management practices used in the operation of the Internet’s Domain Name System. Large public datasets will be used to analyze and statistically model the DNS contents. Graph-based analysis will be used to characterize and quantify activities performed by registrars and domain operators. Tools will be developed for the research community to support machine learning applications for the DNS, anomaly detection in the DNS, and a “DNS Workbench” enabling synthetic DNS workload generation for simulation and laboratory-based research on DNS configuration, management and performance.

This project consists of two complementary components. The scientific component will focus on empirical characterization of the DNS as a whole and in parts by developing and applying techniques from graph theory, statistics and machine learning. The engineering component will use findings from the empirical analysis to identify operational practices, to inform investigation of tools and to drive new insights about how the DNS is used in practice. The project will deliver new insights about how the DNS can be improved as well as new tools for deriving and replicating those insights.

This project will contribute a deeper understanding of what is usual, and what is unusual, with respect to the contents of the DNS. New techniques and findings from this project will lead to a more robust, manageable, and better performing DNS, which has potential to positively impact society as a whole. New materials for networking and data science courses will be developed based on our research. The students who work on the project will receive guidance and mentorship. Research results will be disseminated by publishing in respected academic conferences and workshops, and all software and data artifacts will be made available to the community.

The project webpage will be hosted at https://macro-dns.github.io The webpage will include an overview of the project, project participants, updates on latest results and publications. It will also include all software and data artifacts developed during the project. The webpage and associated repositories will be available on github on an on-going basis after the conclusion of the project.

Title:  Collaborative Research: IMR: MM-1C: Methods for Active Measurement of the Domain Name System

Investigators:  P. Barford (PI), M. Crovella (co-PI) and J. Sommers (co-PI)

Abstract: This project will investigate techniques for measuring the contents of the Domain Name System (DNS) toward the goal of producing a comprehensive census of all record types. The project will focus on developing methods for generating domain names that will be used in queries of the global DNS. Strategies for name generation that will be considered include making use of underutilized data sources and machine learning techniques including Large Language Models. New metrics and a system called EverDNS will be developed to evaluate the effectiveness of the name generation methods by sending high-speed queries to the DNS from different Internet vantage points.

The first component of this project will investigate techniques for assembling the largest-feasible collection of domain names that will be used as targets to query the global DNS. The second component will assess these techniques by developing EverDNS, which will include a prototype data collector and a centralized controller. EverDNS will build on state-of-the-art DNS measurement tools and augment their methods to capture the greatest amount of information within a limited measurement budget. This project will deliver new insights about how to assemble a comprehensive census of the DNS and new data sets for research that will be made available to the community.

The techniques created by this project will enable generation of data sets that will advance scientific understanding of the DNS. It is expected that the insights gained through this research will enable methods for ensuring a more robust, manageable, and better performing DNS. Given the critical role of DNS in the Internet, this project has the potential to positively impact society as a whole. New materials for networking and data science courses will be developed as a direct result of this project. Research results will be disseminated by publishing in respected academic conferences and workshops, and all software and data artifacts will be made available to the community.

The project webpage will be hosted at https://everdns.github.io The webpage will include an overview of the project, project participants, updates on latest results and publications. It will also include all software and data artifacts developed during the project. The webpage and associated repositories will be available on github on an on-going basis after the conclusion of the project.

[3] https://explore.wisc.edu/siftwinnow-july-2023

If you’re not exactly sure what ChatGPT is, Computer Science Professors Jerry Zhu and Fred Sala will get you up to speed in this video where they discuss the rise of this ever-evolving artificial intelligence tool that can “seemingly” answer any question you pitch to it. But what are its limitations? Our AI experts will explain.

https://www.youtube.com/watch?v=fxPoX5llT04

July 25, 2023

  • Prof Yea-Seul Kim and her student had their paper, “Making Data-Driven Articles More Accessible: An Active Preference Learning Approach to Data Fact Personalization” receive an honorable mention award at DIS 2023. Well done! [1]
  • Prof Kirthi Kandasamy and colleagues had their work, “Leveraging Reviews: Learning to Price with Buyer and Seller Uncertainty”, recognized as the Exemplary Artificial Intelligence Track paper at the ACM Conference on Economics and Computation (EC’23). Great job! [2]
  • Prof Fred Sala and colleagues had their paper “Resonant Anomaly Detection with Multiple Reference Datasets” accepted into the Journal of High Energy Physics. Wonderful! [3]
  • Prof Mohit Gupta and colleagues are organizing the IEEE ICCP (International Conference on Computational Photography) in Madison this year (July 28-30). More details here. Mohit encourages any and all who are around to attend. How cool! [4]
  • Included below is \a summary of recent success from Security folks here at UW-Madison, who had eleven papers published at USENIX Security ‘23. Well done to all! (including Profs Barford, Chatterjee, Fawaz, Jha, McDaniel, Zhao) [5]
  • Prof Tom Reps and colleagues received the 2023 CAV Award, for “the introduction of context-bounded analysis and its application to systematic testing of concurrent programs”. Amazing! Congratulations! See below for some of the cool history behind the work. [6]

————–

[1] https://dis.acm.org/2023/dis-2023-program/

 “Making Data-Driven Articles More Accessible: An Active Preference Learning Approach to Data Fact Personalization” received an honorable mention from DIS 2023.

Authors: Yanan Wang, Yea-Seul Kim

Data-driven news articles are widely used to communicate societal phenomena with concrete evidence. These articles are often accompanied by a visualization, helping readers to contextualize content. However, blind and low vision (BLV) individuals have limited access to visualizations, hindering a deep understanding of data. We explore the possibility of dynamically generating data facts (texts describing data patterns in a chart) for BLV individuals based on their preferences to aid the reading of such articles. We conduct a formative study to understand how they perceive system-generated data facts and the factors influencing their preferences. The results indicate the preferences are highly varied among individuals, and a simple preference elicitation alone induces noise. Based on the findings, we developed a method to personalize the data facts generation using an active learning approach. The evaluation studies demonstrate that our model converges effectively and provides more preferable sets of data facts than the baseline.

[2] https://ec23.sigecom.org/program/plenary-speakers/

“Leveraging Reviews: Learning to Price with Buyer and Seller Uncertainty”

Authors: Wenshuo Guo (UC Berkeley); Nika Haghtalab (UC Berkeley); Kirthevasan Kandasamy (UW Madison); Ellen Vitercik (Stanford University)

Abstract: In online marketplaces, customers have access to hundreds of reviews for a single product. Buyers often use reviews from other customers that share their type—such as height for clothing, skin type for skincare products, and location for outdoor furniture—to estimate their values, which they may not know a priori. Customers with few relevant reviews may hesitate to make a purchase except at a low price, so for the seller, there is a tension between setting high prices and ensuring that there are enough reviews so that buyers can confidently estimate their values. Simultaneously, sellers may use reviews to gauge the demand for items they wish to sell. In this work, we study this pricing problem in an online learning setting where the seller interacts with a set of buyers of finitely-many types, one-by-one, over a series of T rounds. At each round, the seller first sets a price. Then a buyer arrives and examines the reviews of the previous buyers with the same type, which reveal those buyers’ ex-post values. Based on the reviews, the buyer decides to purchase if they have good reason to believe that their ex-ante utility is positive. Crucially, the seller does not know the buyer’s type when setting the price, nor even the distribution over types. We provide a no-regret algorithm that the seller can use to obtain high revenue. When there are d types, after T rounds, our algorithm achieves a problem independent O˜(T 2/3d 1/3 ) regret bound. However, when the smallest probability qmin that any given type appears is large, specifically when qmin ∈ Ω(d −2/3T −1/3 ), then the same algorithm achieves a O˜(T 1/2 q −1/2 min ) regret bound. Our algorithm starts by setting lower prices initially, so as to (i) boost the number of reviews and increase the accuracy of future buyers’ value estimates while also (ii) allowing the seller to identify which customers need to be targeted to maximize revenue. This mimics real-world pricing dynamics. We complement these upper bounds with matching lower bounds in both regimes, showing that our algorithm is minimax optimal up to lower order terms

[3] “Resonant Anomaly Detection with Multiple Reference Datasets”

Authors: Mayee F. Chen, Benjamin Nachman, and Frederic Sala

Journal of High Energy Physics (JHEP)

Abstract: An important class of techniques for resonant anomaly detection in high energy physics builds models that can distinguish between reference and target datasets, where only the latter has appreciable signal. Such techniques, including Classification Without Labels (CWoLa) and  Simulation Assisted Likelihood-free Anomaly Detection (SALAD) rely on a single reference dataset. They cannot take advantage of commonly-available multiple datasets and thus cannot fully exploit available information. In this work, we propose generalizations of CWoLa and SALAD for settings where multiple reference datasets are available, building on weak supervision techniques. We demonstrate improved performance in a number of settings with realistic and synthetic data. As an added benefit, our generalizations enable us to provide finite-sample guarantees, improving on existing asymptotic analyses.

[4] https://iccp2023.iccp-conference.org/conference-program/

From Prof Gupta: “ICCP, over the past 15 years, has evolved into a highly inter-disciplinary venue for presenting novel imaging technologies – with presence from various communities, including imaging, optics, computer vision and ML. This year, we have a wonderful lineup of speakers (including several of our UW colleagues). If anyone is around and interested, please feel free to attend, and let me know if you have any questions.”

[5] CS at UW-Madison had 11 papers published at the top-tier conference USENIX Security 2023!  Topics in these areas are wide-ranging:

Abuse Vectors: A Framework for Conceptualizing IoT-Enabled Interpersonal Abuse

Sophie Stephenson and Majed Almansoori, University of Wisconsin–Madison; Pardis Emami-Naeini, Duke University; Danny Yuxing Huang, New York University; Rahul Chatterjee, University of Wisconsin–Madison

“It’s the Equivalent of Feeling Like You’re in Jail”: Lessons from Firsthand and Secondhand Accounts of IoT-Enabled Intimate Partner Abuse

Sophie Stephenson and Majed Almansoori, University of Wisconsin—Madison; Pardis Emami-Naeini, Duke University; Rahul Chatterjee, University of Wisconsin—Madison

Sneaky Spy Devices and Defective Detectors: The Ecosystem of Intimate Partner Surveillance with Covert Devices

Rose Ceccio and Sophie Stephenson, University of Wisconsin—Madison; Varun Chadha, Capital One; Danny Yuxing Huang, New York University; Rahul Chatterjee, University of Wisconsin—Madison

Tubes Among Us: Analog Attack on Automatic Speaker Identification

Shimaa Ahmed and Yash Wani, University of Wisconsin-Madison; Ali Shahin Shamsabadi, Alan Turing Institute; Mohammad Yaghini, University of Toronto and Vector Institute; Ilia Shumailov, Vector Institute and University of Oxford; Nicolas Papernot, University of Toronto and Vector Institute; Kassem Fawaz, University of Wisconsin-Madison

Araña: Discovering and Characterizing Password Guessing Attacks in Practice

Mazharul Islam, University of Wisconsin–Madison; Marina Sanusi Bohuk, Cornell Tech; Paul Chung, University of Wisconsin–Madison; Thomas Ristenpart, Cornell Tech; Rahul Chatterjee, University of Wisconsin–Madison

Automated Cookie Notice Analysis and Enforcement

Rishabh Khandelwal and Asmit Nayak, University of Wisconsin—Madison; Hamza Harkous, Google, Inc.; Kassem Fawaz, University of Wisconsin—Madison

The Space of Adversarial Strategies

Ryan Sheatsley, Blaine Hoak, Eric Pauley, and Patrick McDaniel, University of Wisconsin-Madison

Guarding Serverless Applications with Kalium

Deepak Sirone Jegan, University of Wisconsin-Madison; Liang Wang, Princeton University; Siddhant Bhagat, Microsoft; Michael Swift, University of Wisconsin-Madison

“If sighted people know, I should be able to know:” Privacy Perceptions of Bystanders with Visual Impairments around Camera-based Technology

Yuhang Zhao, University of Wisconsin—Madison; Yaxing Yao, University of Maryland, Baltimore County; Jiaru Fu and Nihan Zhou, University of Wisconsin—Madison

DScope: A Cloud-Native Internet Telescope

Eric Pauley, Paul Barford, and Patrick McDaniel, University of Wisconsin–Madison

Fairness Properties of Face Recognition and Obfuscation Systems

Harrison Rosenberg, University of Wisconsin–Madison; Brian Tang, University of Michigan; Kassem Fawaz and Somesh Jha, University of Wisconsin–Madison

[6] Tom Reps and his former Ph.D. student Akash Lal (now at MSR-India) are two of a group of five who received the 2023 CAV Award (http://i-cav.org/cav-award/), given annually at the CAV conference for fundamental contributions to the field of Computer-Aided Verification.

They were honored “for the introduction of context-bounded analysis and its application to systematic testing of concurrent programs.”  The other recipients are Shaz Qadeer (Meta), Madan Musuvathi (MSR), and Jakob Rehof (TU Dortmand).

The UW CS department has had one previous CAV Award recipient, Somesh Jha, who received it in 2015 for his part in the development of counterexample-guided abstraction refinement.

Backstory to the work, as shared by Tom:

In 2008, Akash and Reps, along with former student Nick Kidd and Tayssir Touili (CNRS), had developed a quite interesting, but complicated-to-explain, technique for context-bounded analysis.  Akash pointed out that he could greatly simplify the implementation task by just performing an equivalent source-to-source transformation.  He was just planning to implement the technique, but wasn’t thinking about writing a paper describing it.  Apparently, Reps said the opposite—and that became the paper for which we received the award (and journal version).  Although Reps has no memory of having insisted that he write it up, Akash said he was absolutely sure about it because their conversation was just six days before the CAV 2008 paper deadline.  Apparently, he got very little sleep during the following six days!  The moral of the story is “always listen to your advisor!”  ?

After Akash joined Microsoft Research India in 2009, he and others at Microsoft incorporated some of the ideas into an important tool for testing concurrent software, called Coyote, which was released open-source and also has had widespread adoption in the Azure components for compute, networking, storage, blockchain, and logistics, as well as Microsoft Teams.

July 3, 2023

  • Emeritus Prof David Wood was selected for the ACM SIGARCH Alan D. Berenbaum Distinguished Service Award for “his exemplary stewardship of the SIGARCH-SIGHPC transition and his decades of leadership in the SIGARCH and broader computer architecture community”.  [1]
  • Prof Christos Tzamos and Prof Ilias Diakonikolas had a paper at STOC ‘23 that somehow we missed last week in the FCRC round up (apologies!), entitled “A Strongly Polynomial Algorithm for Approximate Forster Transforms and its Application to Halfspace Learning”.  [2]
  • Prof Fred Sala was awarded a $100K seedling grant from the Army Research Lab STRONG program, for a proposal entitled “Leveling Up Human-Machine Collaborations by Leveraging Distributed Causal Insights”.  [3]
  • Prof Yudong Chen’s SIGMETRICS paper (mentioned last week), entitled “Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with Latent Low-Rank Structure” received the best student paper award from SIGMETRICS.  [4]
  • Prof Matt Sinclair participated on a panel entitled “The future of gem5” at the Gem5 workshop.  [5]
  • Prof Paris Koutris and students’ paper “Space-Time Tradeoffs for Conjunctive Queries with Access Patterns” won a PODS Distinguished paper award.  [6]
  • Prof Shiv Venkataraman and collaborators had a paper accepted to Supercomputing 2023 entitled “Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning”.  [7]

——————————–

[1] https://www.sigarch.org/benefit/awards/acm-sigarch-distinguished-service-award/

The award is presented annually at the International Symposium on Computer Architecture Awards Banquet. This year’s recipient will be invited to accept the award at ISCA 2023. Recipients receive a memento engraved with their name along with a $1000 honorarium. The award recipient also receives up to $2000 towards support for travel costs, including airfare, hotel, and conference registration for ISCA. The recipient is listed with a citation for their award on the SIGARCH Alan D. Berenbaum Distinguished Service Award webpage.

Past winners include:

Kathryn S McKinley, Per Stenström, Alvin R. Lebeck (UW alum and David’s Ph.D. student), Margaret Martonosi, Koen De Bosschere, Michael Flynn, Doug DeGroot, Norman P. Jouppi, David A. Patterson, Janie Irwin, Mark D. Hill (UW Emeritus Professor), and Alan Berenbaum.

[2] “A Strongly Polynomial Algorithm for Approximate Forster Transforms and its Application to Halfspace Learning”

Ilias Diakonikolas, Christos Tzamos (UW Madison); Daniel Kane (UCSD)

STOC ‘23

https://dl.acm.org/doi/10.1145/3564246.3585191

The Forster transform is a method of regularizing a dataset by placing it in radial isotropic position while maintaining some of its essential properties. Forster transforms have played a key role in a diverse range of settings spanning computer science and functional analysis. Prior work had given weakly polynomial time algorithms for computing Forster transforms, when they exist. Our main result is the first strongly polynomial time algorithm to compute an approximate Forster transform of a given dataset or certify that no such transformation exists. By leveraging our strongly polynomial Forster algorithm, we obtain the first strongly polynomial time algorithm for distribution-free PAC learning of halfspaces. This learning result is surprising because proper PAC learning of halfspaces is equivalent to linear programming. Our learning approach extends to give a strongly polynomial halfspace learner in the presence of random classification noise and, more generally, Massart noise.

[3] https://www.arl.army.mil/cras/strong-cra/

[4] https://www.sigmetrics.org/sigmetrics2023/

[5] ISCA 2023: GEM5 WORKSHOP

Panel: The future of gem5

Panel Members: Mohammad Alian, Brad Beckmann, Gabriel Busnot, Jason Lowe-Power, Miquel Moreto, Matt Sinclair

Link: https://www.gem5.org/events/isca-2023

The gem5 Workshop is an opportunity for members of the gem5 community to give presentations on their contributions to gem5 and gem5-related research. This year’s workshop will start with a 25 minute keynote by Dr. Bobby R. Bruce, titled “gem5 v22, v23 and the future”, and will continue with presentations and discussion panels on gem5 topics.

[6] https://2023.sigmod.org/program_pods.shtml

[7] https://arxiv.org/abs/2306.14086

Mirage: Towards Low-interruption Services on Batch GPU Clusters with Reinforcement Learning

Qiyang Ding, Pengfei Zheng, Shreyas Kudari, Shivaram Venkataraman, Zhao Zhang

Accommodating long-running deep learning (DL) training and inference jobs is challenging on GPU clusters that use traditional batch schedulers, such as Slurm. Given fixed wall clock time limits, DL researchers usually need to run a sequence of batch jobs and experience long interruptions on overloaded machines. Such interruptions significantly lower the research productivity and QoS for services that are deployed in production. To mitigate the issues from interruption, we investigate a set of statistical learning and reinforcement learning (RL) techniques, including random forest, xgboost, Deep Q-Network, and policy gradient to design a proactive provisioner using production job traces from three GPU clusters. We follow the standard machine learning practice by partitioning each job trace into training and validation subsets, then train each model using the training subset and evaluate the generality using the validation subset. We introduce Mirage, a Slurm-compatible resource provisioner that integrates the candidate RL methods. Our experiments show that the Mirage can reduce the interruption by 17-100% and safeguard 23%-76% of jobs with zero interruption across varying load levels on the three clusters.

June 26, 2023

Congratulations to Prof Sharon Li for her NSF CAREER award entitled “Foundations of Human-Centered Machine Learning in the Wild”. Well done! [1]

It’s FCRC, which means a number of our colleagues are presenting their work in various conferences. Great job to all for their work! [2]

“Synthesizing Quantum-Circuit Optimizers”

PLDI ‘23

by Prof Swamit Tannu and Prof Aws Albarghouthi

“Psym: Efficient Symbolic Exploration of Distributed Systems”

PLDI ‘23by Prof Aws Albarghouthi’s post-doc Lauren Pick and colleagues

“Towards Accelerating Data Intensive Application’s Shuffle Process Using SmartNICs” SIGMETRICS ‘23

by Prof Xiangyao Yu and colleagues

“Bias and Extrapolation in Markovian Linear Stochastic Approximation with Constant Stepsizes”

SIGMETRICS ‘23

by Prof Yudong Chen and colleagues

“Overcoming the Long Horizon Barrier for Sample-Efficient Reinforcement Learning with

Latent Low-Rank Structure”

SIGMETRICS ‘23

by Prof Yudong Chen and colleagues

“Scaling Qubit Readout with Hardware Efficient Machine Learning Architectures”

ISCA ‘23

by Prof Swammit Tannu and colleagues

“Enabling High Performance Debugging for Variational Quantum Algorithms

using Compressed Sensing”

ISCA ‘23

by Prof Swammit Tannu and colleagues

“The complexity of counting planar graph homomorphisms of domain size 3”

STOC ‘23

by Prof Jin-Yi Cai and colleague

———

[1] https://www.nsf.gov/awardsearch/showAward?AWD_ID=2237037&HistoricalAwards=false

Machine learning models today must operate amid increasingly dynamic and open environments. One important characteristic of open environments is that the intelligent system will encounter new contexts and out-of-distribution data; that is, data that were not used to train the algorithms the original system. However, current supervised learning is brittle, lacking a reliable understanding of the way data evolves over time and how to respond to changing environments. This introduces a set of new challenges and drives the need for rethinking the design of machine learning algorithms, that can detect out-of-distribution data and repair the learned model under evolving data in the wild. The project will directly impact many real-world domains including autonomous transportation, healthcare, commerce, and scientific discovery.

The objective of this project is to lay new foundations for safe, adaptive, and long-term beneficial learning algorithms in open-world environments. The project has a continuum of three research aims that will fundamentally transform the way that machine learning models are trained, updated, and monitored in the wild: (1) create a new learning framework for reliable decisions, rendering strong safety against unknowns upon deploying machine learning models in the wild; (2) accelerate model adaptation, learning to classify new concepts emerging in the wild while minimizing human supervision required; (3) characterize and understand dynamics in terms of long-term accuracy and safety, maximizing the impact of models as they evolve and operate in the long run. The education plan will publicize the power of open-world machine learning through a new course, a new undergraduate mentorship program ‘Entering AI Research’ and outreach efforts.

[2] https://fcrc.acm.org/

Lots to look through there; check it out?

June 19, 2023

First, promotions. Congratulations to all of the following on their now-official promotions. In alphabetical (last name) order:
  • Congratulations to Prof Ilias Diakonikolas, for his promotion to Full Professor.
  • Congratulations to Prof Yingyu Liang, for his promotion to Associate Professor.
  • Congratulations to Prof Efty Sifakis, for his promotion to Full Professor.
Second, congratulations to the following Wisconsin Professors who have contributed to an amazing eleven papers published at COLT ’23 this year (a top conference in learning theory) [1]. Well done to:
  • Prof Ilias Diakonikolas
  • Prof Jelena Diakonikolas
  • Prof Christos Tzamos
  • Prof Dimitris Papailiopoulos (ECE)
——
The 36th Annual Conference on Learning Theory (COLT 2023) will take place July 12–15, 2023. Assuming the circumstances allow for an in-person conference it will be held in Bangalore, India. We invite submissions of papers addressing theoretical aspects of machine learning, broadly defined as a subject at the intersection of computer science, statistics and applied mathematics. We strongly support an inclusive view of learning theory, including fundamental theoretical aspects of learnability in various contexts, and theory that sheds light on empirical phenomena.

June 5, 2023

  • Emeritus Prof Jeff Naughton’s student, Joe Hellerstein (Ph.D., ‘95, Prof at UC Berkeley since then), won this year’s Edgar F. Codd Innovations Award, which is given for “innovative and highly significant contributions of enduring value to the development, understanding, or use of database systems and databases.” This is the seventh person with Wisconsin roots/connections to have won the award! (David DeWitt, Rakesh Agrawal, Michael Carey, Goetz Graefe, Raghu Ramakrishnan, Anastasia Ailamaki being the others). [1]
  • Prof Matt Sinclair gave a talk entitled “Simulating and Modeling Hardware for Machine Learning Workloads at Scale” at an SRC AIHW Annual Review. [2]
  • Prof Mike Gleicher and colleagues have a paper entitled “A Problem Space for Designing Visualizations” set to appear IEEE Computer Graphics and Applications. [3]
  • Check out this video about incoming Provost Prof Charles Isbell[4]

Notes:

[1] The SIGMOD Edgar F. Codd Innovations Award is given for innovative and highly significant contributions of enduring value to the development, understanding, or use of database systems and databases. Until 2003, this award was known as the “SIGMOD Innovations Award.” In 2004, SIGMOD, with the unanimous approval of ACM Council, decided to rename the award to honor Dr. E.F. (Ted) Codd (1923 – 2003) who invented the relational data model and was responsible for the significant development of the database field as a scientific discipline.

https://sigmod.org/sigmod-awards/sigmod-edgar-f-codd-innovations-award/

[2] Title: Simulating and Modeling Hardware for Machine Learning Workloads at Scale

Abstract: As both the size of Machine Learning (ML) and Artificial Intelligence (AI) hardware and the data they operate on scale-up, there is a growing challenge in performing early-system exploration based on sound simulation methodology. Using existing tools and infrastructure, a detailed simulation of a contemporary deep neural network can take years. In addition, the difficult task of modeling energy at scale, without working RTL for a newly proposed accelerator, or accelerator component (i.e., Tensor Cores), remains an important research challenge. In particular modeling energy in a way that is flexible and can be extended to both different types of programmable accelerators (like GPUs) and to more bespoke discrete accelerators (like TPUs) is needed.

The broad goals of this proposal are to (1) reduce detailed simulation time, particularly on ML training and inference workloads, by characterizing execution in silicon, validated with detailed simulation, to find representative program slices, (2) develop a flexible energy model for accelerators, and (3) develop infrastructure to model the multi-GPU and multi-accelerator system of the future. To tackle these problems, we propose to extend existing simulation infrastructure, such as gem5, to create a scalable, parallel simulation framework that models diverse system components with varying levels of detail. In this talk, I discuss our on-going efforts to identify the most important parts of the applications to simulate at high fidelity while reducing simulation time.

[3] https://arxiv.org/abs/2303.06257

“A Problem Space for Designing Visualizations”

Michael Gleicher, Maria Riveiro, Tatiana von Landesberger, Oliver Deussen, Remco Chang, Christina Gillman

To appear in IEEE CG&A this summer

Abstract: Visualization researchers and visualization professionals seek appropriate abstractions of visualization requirements that permit considering visualization solutions independently from specific problems. Abstractions can help us design, analyze, organize, and evaluate the things we create. The literature has many task structures (taxonomies, typologies, etc.), design spaces, and related “frameworks” that provide abstractions of the problems a visualization is meant to address. In this viewpoint, we introduce a different one, a problem space that complements existing frameworks by focusing on the needs that a visualization is meant to solve. We believe it provides a valuable conceptual tool for designing and discussing visualizations.

[4] Video: https://youtu.be/njMJ7lPIfsw

Title: “Charles Isbell”

May 22, 2023

Prof Ming Liu, Prof Mike Swift, and students had the paper “LEED: A Low-Power, Fast Persistent Key-Value Store on SmartNIC JBOFs” accepted into SIGCOMM ‘23.  [1]

Prof Mike Gleicher’s student Keaton Leppanen won a Chateaubriand Fellowship to spend a semester collaborating with colleagues in Paris [2].

Incoming Provost, and soon-to-be CS Prof Charles Isbell, won the 2023 Richard A. Tapia Achievement Award for Scientific Scholarship, Civic Science, and Diversifying Computing. [3]

Notes:

[1] “LEED: A Low-Power, Fast Persistent Key-Value Store on SmartNIC JBOFs”

Zerui Guo, Hua Zhang, Chenxingyu Zhao, Yuebin Bai, Michael Swift, Ming LiuSIGCOMM ‘23

Abstract: The recent emergence of low-power high-throughput programmable storage platforms–SmartNIC JBOF (just-a-bunch- of-flash)–motivates us to rethink the cluster architecture and system stack for energy-efficient large-scale data-intensive workloads. Unlike conventional systems that use an array of server JBOFs or embedded storage nodes, the introduction of SmartNIC JBOFs has drastically changed the cluster compute, memory, and I/O configurations. Such an extremely imbalanced architecture makes prior system design philosophies and techniques either ineffective or invalid.

This paper presents LEED, a distributed, replicated, and persistent key-value store over an array of SmartNIC JBOFs. Our key ideas to tackle the unique challenges induced by a SmartNIC JBOF are: trading excessive I/O bandwidth for scarce SmartNIC core computing cycles and memory capacity; making scheduling decisions as early as possible to streamline the request execution flow. LEED systematically revamps the software stack and proposes techniques across per-SSD, intra-JBOF, and inter-JBOF levels. Our prototyped system based on Broadcom Stingray outperforms existing solutions that use beefy server JBOFs and wimpy embedded storage nodes by 4.2X/3.8X and 17.5X/19.1X in terms of requests per Joule for 256B/1KB key-value objects.

[2] https://www.chateaubriand-fellowship.org/

The Chateaubriand Fellowship is a grant offered by the Embassy of France in the United States. It supports outstanding PhD students from U.S. institutions who wish to conduct part of their doctoral research in France for a period ranging from 4 to 8/9 months. Chateaubriand fellows are selected through a merit-based competition, with expert evaluation in France and in the United States.

[3] https://tapiaconference.cmd-it.org/about/richard-tapia-award/

CMD-IT proudly announces Dr. Charles Isbell as the 2023 recipient of the Richard A. Tapia Achievement Award for Scientific Scholarship, Civic Science, and Diversifying Computing. Dr. Isbell is currently The John P. Imlay Dean of the College of Computing at Georgia Tech. This fall, he will be joining the University of Wisconsin–Madison’s leadership team as the next Provost. Throughout his career, Dr. Isbell has contributed to broadening participation in computing, including serving as the founding Executive Director for the Constellations Center for Equity in Computing.

The Richard A. Tapia Achievement Award for Scientific Scholarship, Civic Science, and Diversifying Computing is presented to a distinguished computational scientist, computer scientist, or computer engineer who is making significant contributions to computing and civic areas, including teaching, mentoring, advising, and building and serving diverse communities. The award winner represents extraordinary leadership in computing scholarship and the mission to increase participation of groups who are underrepresented in computing.

The award presentation will take place during the awards ceremony at the CMD-IT/ACM Richard Tapia Celebration of Diversity in Computing Conference in Dallas, Texas, September 13-15, 2023.

May 15, 2023

Prof Swamit Tannu will be giving talks at the Next Steps in Quantum Computing Workshop and at Chalmers and Uppsala universities in Sweden, with talks entitled “Scaling and Sustaining Quantum Cloud Computing Paradigm” and “Scalable Quantum Computer Architecture: A Frontier for Systems and Computer Architects”. [1]

Prof Bilge Mutlu with collaborators from ECE (Kassam Fawaz – CS Affiliate) and SOHE (Heather Kirkorian) received new NSF funding, titled “SaTC: CORE: Medium: Designing Privacy-Aware Social Companion Robots.”  [2]

Prof Karu Sankaralingam and many collaborators at WiSys and across campus were awarded an NSF Regional Innovation Engine award. This is one of two awards across the state, and of 43 awards nationally, and comes from NSF’s new Directorate for Technology, Innovation and Partnerships as authorized by the “CHIPS and Science Act of 2022”.  [3]

Prof Bilge Mutlu and student Amy Koike have a new paper accepted to 2023 ACM Designing Interactive Systems, titled “Exploring the Design Space of Extra-Linguistic Expression for Robots.” [4]

The CS department held its biggest graduation ceremonies ever, celebrating the graduation of over 700 undergraduates, 75 masters students, and 10 or so Ph.D. students, over four ceremonies (two undergraduate ones were needed).

Notes:

[1] “Scaling and Sustaining Quantum Cloud Computing Paradigm”

Swamit Tannu

Venue: Next Steps in Quantum Computing Workshop https://cra.org/ccc/events/5-year-update-to-the-next-steps-in-quantum-computing-workshop/

“Scalable Quantum Computer Architecture: A Frontier for Systems and Computer Architects”

Swamit Tannu

Chalmers University (Sweden) and Uppsala University (Sweden)

Abstract: Quantum computers leverage fundamental quantum-mechanical properties of their constituent quantum bits (qubits) to gain a computational advantage for a class of problems. Today, scientists and engineers are racing to build larger, more-reliable quantum computers and demonstrate their effectiveness at evaluating increasingly complex quantum algorithms. Unfortunately, qubits are extremely sensitive to noise. Thus the ultimate goal is to build a large-scale quantum computer that can run quantum error correction to enable fault-tolerant computations to demonstrate a computational advantage for significant scientific and commercial problems. However, scaling quantum computers is challenging on several fronts, including technology, engineering, systems, and economics. Computer architects will play a crucial role in building quantum computing ecosystems.

In this talk, I will introduce a few challenges in building fundamental architectural blocks for quantum computers and strategies we can adopt to mitigate those challenges.

[2] SaTC: CORE: Medium: Designing Privacy-Aware Social Companion Robots

Bilge Mutlu, Kassam Fawaz, Heather Kirkorian

Social companion robots can interact with people in a human-like way, using speech, gestures, and physical presence. They are designed to help families with activities like meal planning, child development, and communication. However, their advanced sensing and reasoning capabilities can lead to privacy concerns, as they can collect, infer, and share personal information about their users. For instance, they can overhear private conversations as they move and act in the home. They also require access to sensitive environments, such as bedrooms, and can be designed to look like familiar objects or people, which can lead to people oversharing personal information. They might also inappropriately disclose information they learned with other home occupants or visitors. Therefore, it is crucial to design more “privacy-aware” social companion robots that can reason about when to collect and share data, and when not to, in ways that are context-appropriate and match people’s needs for privacy. This interdisciplinary project aims to create privacy-aware design principles for privacy-aware social robots to improve human-robot interactions in home environments. These principles will inform how robots are designed, used, and accepted in homes, workplaces, schools, and healthcare. Further, through outreach activities, the project will improve families’ understanding of privacy in smart environments through media coverage, workshops, and office hours. The educational activities of the project will train K-12 and college students in privacy systems, family studies, and human-computer interaction. Finally, the project team will share code, publications, prototypes, and datasets based on the work.

This project develops, implements, and evaluates a novel privacy framework for human-robot interaction in four phases. This framework models users’ privacy expectations, employs novel concepts grounded in human-robot interaction, develops new methods to help users manage their own privacy, and enables a study of privacy dynamics within families. Specifically, Phase I models users’ privacy concerns and expectations regarding social robots, developing a privacy framework to identify everyday situations in which privacy awareness is most important. Phase II instantiates the framework as a privacy controller for the robot’s actions that captures the dynamics of social privacy. At the core of the privacy controller is a customizable architecture that extracts high-level contextual factors from the robot’s environment, such as location in the house and number of people present, and generates privacy-aware actions. Phase III explores how the robot can leverage the privacy controller architecture to signal, demonstrate, or explain the robot’s privacy-aware actions to users. Through deployment studies, Phase IV assesses the impact of privacy awareness on the robot’s management of the complex privacy trade-offs involved in participation in family life. The findings of these studies will provide an empirical basis for understanding a privacy-aware social robot’s benefits, effects, and limitations on family dynamics.

[3]  Prof Sankaralingam, along with a team led by WiSys, was awarded an NSF Regional Innovation Engine award. This is one of two awards across the state, and of 43 awards nationally. Launched by NSF’s new Directorate for Technology, Innovation and Partnerships and authorized by the “CHIPS and Science Act of 2022,” the NSF Engines program uniquely harnesses the nation’s science and technology research and development enterprise and regional-level resources. NSF Engines aspire to catalyze robust partnerships to positively impact the economy within a geographic region, address societal challenges, advance national competitiveness and create local, high-wage jobs. The WiSys engine in particular is on Agtech and Sustainable agriculture driven by Artificial Intelligence, Advanced Materials, Disaster Mitigation/Prevention, Energy & Industrial Efficiency Tech, BigData and Cybersecurity. Senator Tammy Baldwin’s office has a press release on this announcement and WiSys press release has more details on the project.

https://www.baldwin.senate.gov/news/press-releases/senator-baldwin-helps-secure-funding-for-regional-innovation-engines-in-wisconsin

https://www.wisys.org/news-media/wisys-led-partnership-wins-1-million-nsf-grant-to-make-wisconsin-a-global-leader-in-sustainable-ag

[4] “Exploring the Design Space of Extra-Linguistic Expression for Robots”

Prof. Mutlu and CS graduate student Amy Koike

2023 ACM Designing Interactive Systems

We explore the new design space of extra-linguistic cues inspired from graphical tropes used in graphic novels and animation to enhance the expressiveness of social robots. To achieve this, we identified a set of cues that can be used to generate expressions, including smoke/steam/fog, water droplets, and bubbles. We prototyped devices that can generate these fluid expressions for a robot and conducted design sessions where eight designers explored the use and utility of the cues in conveying the robot’s internal states in various design scenarios. Our analysis of the 22 designs, the associated design justifications, and the interviews with designers revealed patterns in how each cue was used, how they were combined with nonverbal cues, and where the participants drew their inspiration from. These findings informed the design of an integrated module called EmoPack, which can be used to augment the expressive capabilities of any robot platform.

May 8, 2023

Prof Kirthi Kandasamy and collaborators had the paper “Leveraging Reviews: Learning to Price with Buyer and Seller Uncertainty” accepted at the ACM Conference on Economics and Computation (EC’23).  [1]

Prof Matt Sinclair and collaborators had four papers accepted into the gem5 Users’ workshop entitled “Closing the Gap: Improving the Accuracy of gem5’s GPU Models”, “Improving gem5’s GPUFS Support”, “Analyzing the Benefits of More Complex Cache Replacement Policies in Moderns GPU LLCs”, and “Improving the Speed of gem5’s GPU Regression Tests”.  [2]

Prof Bart Miller gave a Distinguished Lecture at the CISPA security research center (the largest federally funded security center in Germany) at the University of Saarbruecke entitled “Random Testing with ‘Fuzz’: 30 Years of Finding Bugs”. [3]

Prof Karu Sankaralingam and collaborators had the paper “Making Transformers Compute-lite for CPU inference” accepted at ICML. [4]

Prof Paris Koutris and collaborators published “The Fine-Grained Complexity of Boolean Conjunctive Queries and Sum-Product Problems” at ICALP ‘23. [5]

Prof Jignesh Patel (with collaboration from our partners at gener8tor) ran another successful instantiation of the NEST competition, which provides year-long training and exposure for students in entrepreneurship.  [6].

Notes:

[1] “Leveraging Reviews: Learning to Price with Buyer and Seller Uncertainty”

Authors: Wenshuo Guo (UC Berkeley); Nika Haghtalab (UC Berkeley); Kirthevasan Kandasamy (UW Madison); Ellen Vitercik (Stanford University)

Abstract: In online marketplaces, customers have access to hundreds of reviews for a single product. Buyers often use reviews from other customers that share their type—such as height for clothing, skin type for skincare products, and location for outdoor furniture—to estimate their values, which they may not know a priori. Customers with few relevant reviews may hesitate to make a purchase except at a low price, so for the seller, there is a tension between setting high prices and ensuring that there are enough reviews so that buyers can confidently estimate their values. Simultaneously, sellers may use reviews to gauge the demand for items they wish to sell. In this work, we study this pricing problem in an online learning setting where the seller interacts with a set of buyers of finitely-many types, one-by-one, over a series of T rounds. At each round, the seller first sets a price. Then a buyer arrives and examines the reviews of the previous buyers with the same type, which reveal those buyers’ ex-post values. Based on the reviews, the buyer decides to purchase if they have good reason to believe that their ex-ante utility is positive. Crucially, the seller does not know the buyer’s type when setting the price, nor even the distribution over types. We provide a no-regret algorithm that the seller can use to obtain high revenue. When there are d types, after T rounds, our algorithm achieves a problem independent O˜(T 2/3d 1/3 ) regret bound. However, when the smallest probability qmin that any given type appears is large, specifically when qmin ∈ Ω(d −2/3T −1/3 ), then the same algorithm achieves a O˜(T 1/2 q −1/2 min ) regret bound. Our algorithm starts by setting lower prices initially, so as to (i) boost the number of reviews and increase the accuracy of future buyers’ value estimates while also (ii) allowing the seller to identify which customers need to be targeted to maximize revenue. This mimics real-world pricing dynamics. We complement these upper bounds with matching lower bounds in both regimes, showing that our algorithm is minimax optimal up to lower order terms

[2] The gem5 Users’ workshop details are found here:

https://www.gem5.org/events/isca-2023

[A] “Closing the Gap: Improving the Accuracy of gem5’s GPU Models”

Authors: Vishnu Ramadas, Daniel Kouchekinia, Ndubuisi Osuji, and Matthew D. Sinclair

Abstract: In recent years, we have been enhancing and updating gem5’s GPU support, including enhanced gem5’s GPU support to enable running ML workloads. Moreover, we created, validated, and released a Docker image with the proper software and libraries needed to run AMD’s GCN3 and Vega GPU models in gem5. With this container, users can run the gem5 GPU model, as well as build the ROCm applications that they want to run in the GPU model, out of the box without needing to properly install the appropriate ROCm software and libraries. Additionally, we updated gem5 to make it easier to reproduce results, including releasing support for a number of GPU workloads in gem5-resources and enabling continuous integration testing for a variety of GPU workloads.

Current gem5 support focuses on Carrizo- and Vega-class GPUs. Unfortunately, these models do not always provide high accuracy relative to the equivalent ”real” GPUs. This leads to a mismatch in expectations: when prototyping new optimizations in gem5 users may draw the wrong conclusions about the efficacy of proposed optimizations if gem5’s GPU models do not provide high fidelity. Accordingly, to help bridge this divide, we design a series of micro-benchmarks designed expose the latencies, bandwidths, and sizes of a variety of GPU components on real GPUs. By iteratively applying fixes and improvements to gem’s GPU model, we significantly improve its fidelity relative to real AMD GPUs.

[B] “Improving gem5’s GPUFS Support”

Authors: Vishnu Ramadas, Matthew Poremba, Bradford M. Beckmann, and Matthew D. Sinclair

Abstract: With the waning of Moore’s Law and the end of Dennard’s Scaling, systems are turning towards heterogeneity, mixing conventional cores and specialized accelerators to continue scaling performance and energy efficiency. Specialized accelerators are frequently used to improve the efficiency of computations that run inefficiently on conventional, general-purpose processors. As a result, systems ranging from smartphones to data-centers, hyper-scalars, and supercomputers are increasingly using large numbers of accelerators to provide better efficiency than CPU-based solutions. However, heterogeneous systems face key challenges: changes to the underlying technology which threaten continued scaling, as well as the voracious scaling from applications, which require additional research to address. Traditionally, simulators could be used to perform early exploration for this research. However, existing simulators lack important support for these key challenges.

Detailed simulation of modern systems can take extremely long times in existing tools and infrastructure. Furthermore, prototyping optimizations at scale can also be challenging, especially for newly proposed accelerators. Although other simulators such as Accel-Sim, SCALE-Sim, and Gemmini enable some early experiments, they are limited in their ability to target a wide variety of accelerators. In comparison, gem5 has support for various CPUs, GPUs, DSPs, and many other important accelerators. However, efficiently simulating large-scale workloads on gem5’s cycle-level models requires prohibitively long times. In this work we discuss our efforts to enhance gem5’s support to make running these workloads practical while retaining accuracy.

[C] “Analyzing the Benefits of More Complex Cache Replacement Policies in Moderns GPU LLCs”

Authors: Jarvis Jia and Matthew D. Sinclair

Abstract: The gem5 simulator offers Classic and Ruby as two separate memory models for simulating on-chip caches. The Classic model, which originated from M5, is a quick and simple option that allows for easy configuration, but only supports a basic MOESI coherence protocol. On the other hand, the Ruby model, which was developed by GEMS, is a more advanced and flexible option that can accurately simulate a wider range of cache coherence protocols and features. However, choosing between the two memory system models in gem5 is challenging for researchers as each has advantages and limitations which can be inconvenient. In particular, this has led to a bifurcation of effort where prior work has added replacement policies to Classic and Ruby in parallel – duplicating effort unnecessarily and preventing users from using a desired replacement policy if it is not implemented in the desired memory model (e.g., users could only use RRIP in Classic).

Accordingly, we merged the cache replacement policies from Classic to Ruby, enabling users to use any of the replacement policies in either memory model. Gem5 currently has the capability to support 13 replacement policies, which can be used exchangeable within the Classic and Ruby cache models, including commonly used options like LRU, FIFO, PseudoLRU, and different types of RRIPs. After combining the replacement policies for the Classic and Ruby cache models, we designed and integrated (into gem5’s nightly regressions) multiple corner case tests to verify and ensure the continued correct functionality of these policies. Through these tests, we identified and fixed several bugs to ensure that the  replacement policies operate correctly. Finally, with the newly enabled and verified functionality, since there is limited information about how different replacement policies affects GPU performance, we decided to use gem5 to study these policies in a GPU context. Specifically, we study GPU LLC caches, and show how different replacement policies impact performance for both WT and WB LLC caches of various sizes.

[D] Improving the Speed of gem5’s GPU Regression Tests

Authors: James Braun and Matthew D. Sinclair

Abstract: In recent years, we have been enhancing and updating gem5’s GPU support, including enhanced gem5’s GPU support to enable running ML workloads. Moreover, we created, validated, and released a Docker image with the proper software and libraries needed to run AMD’s GCN3 and Vega GPU models in gem5. With this container, users can run the gem5 GPU model, as well as build the ROCm applications that they want to run in the GPU model, out of the box without needing to properly install the appropriate ROCm software and libraries. Additionally, we updated gem5 to make it easier to reproduce results, including releasing support for a number of GPU workloads in gem5-resources and enabling continuous integration testing for a variety of GPU workloads.

However, in an effort to provide sufficient coverage, the current testing support for GPU tests requires significant runtime both for the nightly and weekly regression tests. Currently most of these regression tests test the GPU SE mode support, since GPU FS mode support is still nascent. Unfortunately, much of this time is spent parsing input files to create arrays and other data structures that the GPU subsequently computes on. Although SE mode does not simulate the system calls needed to read these input files, nevertheless this still represents a significant overhead that increases runtime and prevents other tests (potentially providing additional coverage) from being run in that same timeframe. In an effort to address this, in the work we utilize SE mode’s avoiding modeling system calls to speed up the runtime of the GPU regression tests. Specifically, we redesign the input reading phase of these GPU tests to create and use mmap’d files for their input arrays (which SE mode completes all at once) instead of reading in the files entry by entry. In doing so, we reduce runtime by at least 41%

[3] “Random Testing with ‘Fuzz’: 30 Years of Finding Bugs”

Barton Miller

Fuzz testing has passed its 30th birthday and, in that time, has gone from a disparaged and mocked technique to one that is the foundation of many efforts in software engineering and testing. The key idea behind fuzz testing is using random input and having an extremely simple test oracle that only looks for crashes or hangs in the program. Importantly, in all our studies, all our tools, test data, and results were made public so that others could reproduce the work. In addition, we located the cause of each failure that we caused and identified the common causes of such failures.

In the last several years, there has been a huge amount of progress and new developments in fuzz testing. Hundreds of papers have been published on the subject and dozens of PhD dissertations have been produced. In this talk, I will review the progress over the last 30 years describing our simple approach – using what is now called black box generational testing – and show how it is still relevant and effective today.

In 1990, we published the results of a study of the reliability of standard UNIX application/utility programs. This study showed that by using simple (almost simplistic) random testing techniques, we could crash or hang 25-33% of these utility programs. In 1995, we repeated and significantly extended this study using the same basic techniques: subjecting programs to random input streams. This study also included X-Window applications and servers. A distressingly large number of UNIX applications still crashed with our tests. X-window applications were at least as unreliable as command-line applications. The commercial versions of UNIX fared slightly better than in 1990, but the biggest surprise was that Linux and GNU applications were significantly more reliable than the commercial versions.

In 2000, we took another stab at random testing, this time testing applications running on Microsoft Windows. Given valid random mouse and keyboard input streams, we could crash or hang 45% (NT) to 64% (Win2K) of these applications. In 2006, we continued the study, looking at both command-line and GUI-based applications on the relatively new Mac OS X operating system. While the command-line tests had a reasonable 7% failure rate, the GUI-based applications, from a variety of vendors, had a distressing 73% failure rate. Recently, we decided to revisit our basic techniques on commonly used UNIX systems. We were interested to see that these techniques were still effective and useful.

In this talk, I will discuss our testing techniques and then present the various test results in more details. These results include, in many cases, identification of the bugs and the coding practices that caused the bugs. In several cases, these bugs introduced issues relating to system security. The talk will conclude with some philosophical musings on the current state of software development. Papers on the four studies (1990, 1995, 2000, 2006, and 2020), the software and the bug reports can be found at the UW fuzz home page:

http://www.cs.wisc.edu/~bart/fuzz/

More details: https://cispa.de/dls-miller

[4] “Making Transformers Compute-lite for CPU inference”

Authors: Zhanpeng Zeng, Michael Davies, Pranav Pulijala, Karthikeyan Sankaralingam, Vikas Singh

While GPU clusters remain the de facto choice for training large deep neural network (DNN) models today, a number of reasons including ease of workflow/deployment, security and cost have led to efforts investigating whether CPUs may be viable platforms for inference in day-to-day use across many sectors of the industry. But the imbalance between the compute capabilities of GPUs and CPUs is huge, which has prompted both specialized hardware enhancements from major vendors as well as evaluating which algorithmic modifications can make deployment more CPU friendly. Motivated by these considerations, we study a module which is a workhorse within modern DNN architectures, Feed Forward Networks (FFNs) and assess to what extent it can be made compute- (or FLOP-) lite. Our formulation recasts most essential operations as a memory look-up, leveraging the trade-off between the two resources on any platform: compute and memory (since CPUs offer it in abundance). Our development involves a careful redesign of each operation in a FFN, complemented with a detailed hardware profiling of strategies that will maximize efficiency — not just on contemporary hardware but on products that will be offered in the near/medium term future. While our use case is CPU inference, our method is differentiable and can be trained in an end-to-end manner via back-propagation, which enables easy training and integration into common DNN models.

[5] “The Fine-Grained Complexity of Boolean Conjunctive Queries and Sum-Product Problems”

Authors: Austen Z. Fan, Paraschos Koutris, Hangdong Zhao

ICALP ‘23

Abstract: We study the fine-grained complexity of evaluating Boolean Conjunctive Queries and their generalization to sum-of-product problems over an arbitrary semiring. For these problems, we present a general semiring-oblivious reduction from the k-clique problem to any query structure (hypergraph). Our reduction uses the notion of embedding a graph to a hypergraph, first introduced by Marx. As a consequence of our reduction, we can show tight conditional lower bounds for many classes of hypergraphs, including cycles, Loomis-Whitney joins, some bipartite graphs, and chordal graphs. These lower bounds have a dependence on what we call the clique embedding power of a hypergraph H, which we believe is a quantity of independent interest. We show that the clique embedding power is always less than the submodular width of the hypergraph, and present a decidable algorithm for computing it.

[6] NEST Competition results:

Congratulations to the 2023 CS NEST Pitch Competition Winners!!

? 1st: Crib – $5,000 (Isaac Lee, Vinesh Janarthanan)

? 2nd: PhD Insiders – $2,500 (Jake Khoussine)

? 3rd: Stuniverse – $1,500 (Nikolai Grin, Takuzen Lu)

? WI Idea: GünGellir – $1,000 (Asude Aydagul)

We were fortunate enough to have a highly experienced judging panel providing feedback and questions to all the presenting teams. Huge thanks to them!

Dave Fuhrmann – Epic, Senior VP

Jack Koziol – UW-Madison, Entrepreneur in Residence & Infosec, Exited Founder

Adhira Sunkara – WiSys, Assistant Director

Wes Schroll – Fetch, Founder & CEO

To anyone interested in creating new technologies or launching a business – join us next semester! CS NEST is a venture creation program designed to help entrepreneurially minded students, postdocs, and recent alumni create high growth startups. CS NEST is a partnership between UW Computer Sciences, WARF, and gener8tor.

Each April we end the program with a pitch competition and the opportunity for teams to win prize money.

For more info about the program and/or competition, reach out to the CS NEST Director, Jake Pollastrini at jake.pollastrini@gener8tor.com

April 17, 2023

Prof Ming Liu had the paper “Fabric-Centric Computing” accepted into the HotOS ‘23 workshop.  [1]

Prof Yea-Seul Kim (with help from the curriculum committee and other HCI faculty) had the following course approved: COMP SCI 565: Introduction to Data Visualization. [2]

Prof Mohit Gupta (with help from the curriculum committee and other Vision faculty) had the following course approved: COMP SCI 566: Introduction to Computer Vision. [3]

Prof Suman Banerjee, Prof Mohit Gupta, and colleagues had the paper “QfaR: Location-Guided Scanning of Visual Codes from Long Distances” accepted into MobiCom ‘23.  [4]

Prof Andrea Arpaci-Dusseau and Prof Remzi Arpaci-Dusseau and colleagues had the paper “WiscSort: External Sorting For Byte-Addressable Storage” accepted into PVLDB. [5]

There was a wonderful article about Prof Miron Livny and his recent Vilas honors here. [6]

Prof Rahul Chatterjee’s (and affiliated Prof Kassem Fawaz’s) student Paul Chung wins the Goldwater Scholarship.  [7]

Notes:

[1] Title: Fabric-Centric Computing

Author: Ming Liu

Abstract:

Emerging memory fabrics and the resulting composable infrastructure have fundamentally challenged our conventional wisdom on how to build efficient rack/cluster-scale systems atop. This position paper proposes a new computing paradigm–called Fabric-Centric Computing (FCC)–that views the memory fabric as the first-class citizen to instantiate, orchestrate, and reclaim computations over composable infrastructure. We describe its design principles, report our early experiences, and discuss a new intermediate system stack proposal that harnesses the uniqueness of this cluster interconnect and realizes the version of FCC.

[2] COMP SCI 565: Introduction to Data Visualization

Created by Prof Yea-Seul Kim

Introduction to topics such as perception, cognition, communication, design, implementation, applications, tools, and evaluation. Provides a broad survey of the field and covers fundamental concepts, theory, and tools in data visualization with opportunities for hands-on activities. Gain real-world experience in designing and evaluating visualizations.

[3] COMP SCI 566: Introduction to Computer Vision

Created by Prof Mohit Gupta

Topics include image formation, feature detection, motion estimation, image mosaics, 3D shape reconstruction, and object recognition. Applications of these techniques include building 3D maps, creating virtual characters, organizing photo and video databases, human computer interaction, video surveillance, and automatic vehicle navigation. Broad overview of various computer vision and machine learning techniques and sensing and imaging technologies used in computer vision applications. This is a project-based course.

[4] QfaR: Location-Guided Scanning of Visual Codes from Long Distances

Authors: Sizhuo Ma (Snap Research), Jian Wang (Snap Research), Wenzheng Chen (U. Toronto), Suman Banerjee, Mohit Gupta, and Shree Nayar (Snap Research)

Abstract: Visual codes such as QR codes provide a low-cost and convenient communication channel between physical objects and mobile devices, but typically operate when the code and the device are in close physical proximity. We propose a system, called QfaR, which enables mobile devices to scan visual codes across long distances even where the image resolution of the visual codes is extremely low. QfaR is based on {\em location-guided code scanning}, where we utilize a crowdsourced database of physical locations of codes. Our key observation is that if the approximate location of the codes and the user is known, the space of possible codes can be dramatically pruned down. Then, even if every “single bit” from the low-resolution code cannot be recovered, QfaR can still identify the visual code from the pruned list with high probability. By applying computer vision techniques, QfaR is also robust against challenging imaging conditions, such as tilt, motion blur,\etc. Experimental results with common iOS and Android devices show that QfaR can significantly enhance distances at which codes can be scanned, \eg 3.6cm-sized codes can be scanned at a distance of 7.5 meters, and 0.5m-sized codes at about 100 meters. QfaR has many potential applications, and beyond our diverse experiments, we also conduct a simple case study on its use for efficiently scanning QR code-based badges to estimate event attendance.

[5] WiscSort: External Sorting For Byte-Addressable Storage

Vinay Banakar, Kan Wu (Google), Yuvraj Patel (Edinburgh), Kim Keeton (Google), Andrea Arpaci-Dusseau, Remzi Arpaci-Dusseau

PVLDB ‘23

Abstract: We present WiscSort, a new approach to high-performance con- current sorting for existing and future byte-addressable storage (BAS) devices. WiscSort carefully reduces writes, exploits random reads by splitting keys and values during sorting, and performs interference-aware scheduling with thread pool sizing to avoid I/O bandwidth degradation. We introduce the BRAID model which encompasses the unique characteristics of BAS devices. Many state- of-the-art sorting systems do not comply with the BRAID model and deliver sub-optimal performance, whereas WiscSort demonstrates the effectiveness of complying with BRAID. We show that WiscSort is 2-7x faster than competing approaches on a standard sort benchmark. We evaluate the effectiveness of key-value sepa- ration on different key-value sizes and compare our concurrency optimizations with various other concurrency models. Finally, we emulate generic BAS devices and show how our techniques perform well with various combinations of hardware properties.

[6] Article: Here.

Pioneering University of Wisconsin–Madison researchers Stacey J. Lee, a scholar of the education of immigrant communities, and Miron Livny, a leader in high-throughput computing, have been named Vilas Research Professors.[7] Student Paul Chung wins the Goldwater Scholarship. Link: Here.

Chung is majoring in computer sciences and data science with comprehensive honors. He began his cybersecurity research while still in high school in his native South Korea, working with Professor Chang Hoon Kim at Daegu University. Chung’s family immigrated to the U.S. to provide him with more educational and research opportunities. At UW–Madison, he has undertaken research with professors Rahul Chatterjee and Kassem Fawaz at MadS&P, the computer security and privacy research group on campus, and he has interned with the university’s Office of Cybersecurity. Additionally, in summer 2022, Chung completed a research opportunity with Dr. Hanan Hibshi at the Information Networking Institute at Carnegie Mellon University. Chung is the first author on one published manuscript and the co-author on two others, and he currently has four additional manuscripts under submission. Chung plans to pursue a doctorate in computer science. He will be visiting the Max Planck Institute in Germany this summer for additional research opportunities.

April 3, 2023

Prof Ming Liu and colleagues’ ASPLOS ‘23 paper, “A Generic Service to Provide In-Network Aggregation for Key-Value Streams”, was selected as a Distinguished Paper. Great job! [1]

Prof Kirthi Kandasamy and colleagues had a paper conditionally accepted into OSDI ‘23, entitled “Cilantro: A Framework for Performance-Aware Resource Allocation for General Objectives via Online Feedback”. Great work! [2]

Two graduate students in our department won NSF graduate fellowships: Hailey Johnson (advised by Prof Bilge Mutlu) and Ivan Hu (advised by Prof Dieter van Melkebeek). Congratulations to all of you! [3]

Prof Bilge Mutlu and colleagues had two further papers accepted into the ACM Interaction Design and Children (IDC) Conference, entitled “My Unconditional Homework Buddy:” Designing Augmented Homework Experiences for Children with a Social Learning Companion Robot” and “Family Theories in Child-Robot Interactions: Understanding Families as a Whole for Child-Robot Interaction Design”. Well done! [4]

Notes:

[1] See https://asplos-conference.org for details.

Paper title: A Generic Service to Provide In-Network Aggregation for Key-Value Streams

Authors: Yongchao He, Wenfei Wu, Yanfang Le, Ming Liu, ChonLam Lao

Abstract: Key-value stream aggregation is a common operation in distributed systems, which requires intensive computation and network resources. We propose a generic in-network aggregation service for key-value streams, ASK, to accelerate the aggregation operations in diverse distributed applications. ASK is a switch-host co-designed system, where the programmable switch provides a best-effort aggregation service, and the host runs a daemon to interact with applications. ASK makes in-depth optimization tailored to traffic characteristics, hardware restrictions, and network unreliable natures: it vectorizes multiple key-value tuples’ aggregation of one packet in one switch pipeline pass, which improves the per-host’s goodput; it develops a lightweight reliability mechanism for key-value stream’s asynchronous aggregation, which guarantees computation correctness; it designs a hot-key agnostic prioritization for key-skewed workloads, which improves the switch memory utilization. We prototype ASK and use it to support Spark and BytePS. The evaluation shows that ASK could accelerate pure key-value aggregation tasks by up to 155 times and big data jobs by 3-5 times, and be backward compatible with existing INA-empowered distributed training solutions with the same speedup.

[2] Cilantro: A Framework for Performance-Aware Resource Allocation for General Objectives via Online Feedback

Authors: R. Bhardwaj*, K. Kandasamy*, A. Biswal, W. Guo, B. Hindman, J. Gonzalez, M. Jordan, I. Stoica

Abstract: Traditional systems for allocating finite cluster resources among competing jobs have either aimed at providing fairness, relied on users to specify their resource requirements, or have estimated these requirements via surrogate metrics (e.g. CPU utilization). These approaches do not account for a job’s real world performance (e.g. P95 latency). Existing performance-aware systems use offline profiled data and/or are designed for specific allocation objectives. In this work, we argue that resource allocation systems should directly account for real-world performance and the varied allocation objectives of users. In this pursuit, we build Cilantro.

At the core of Cilantro is an online learning mechanism which forms feedback loops with the jobs to estimate the resource to performance mappings and load shifts. This relieves users from the onerous task of job profiling and collects reliable real-time feedback. This is then used to achieve a variety of user-specified scheduling objectives. Cilantro handles the uncertainty in the learned models by adapting the underlying policy to work with confidence bounds. We demonstrate this in two settings. First, in a multi-tenant 1000-CPU cluster with 20 independent jobs, three of Cilantro’s policies outperform 9 other baselines on three different performance-aware scheduling objectives, improving user utilities by up to 1.2 − 3.7x. Second, in a microservices setting, where 160 CPUs must be distributed between 19 inter-dependent microservices, Cilantro outperforms 3 other baselines, reducing the end-to-end P99 latency to x0.57 the next best baseline.

[3] https://www.nsfgrfp.org/

The NSF Graduate Research Fellowship Program (GRFP) helps ensure the vitality of the human resource base of science and engineering in the United States and reinforces its diversity. The program recognizes and supports outstanding graduate students in NSF-supported science, technology, engineering, and mathematics disciplines who are pursuing research-based master’s and doctoral degrees at accredited United States institutions.

As the oldest graduate fellowship of its kind, the GRFP has a long history of selecting recipients who achieve high levels of success in their future academic and professional careers. The reputation of the GRFP follows recipients and often helps them become life-long leaders that contribute significantly to both scientific innovation and teaching. Past fellows include numerous Nobel Prize winners, former U.S. Secretary of Energy, Steven Chu, Google founder, Sergey Brin and Freakonomics co-author, Steven Levitt.

Fellows share in the prestige and opportunities that become available when they are selected. Fellowships provide the student with a three-year annual stipend of $37,000 along with a $12,000 cost of education allowance for tuition and fees (paid to the institution), as well as access to opportunities for professional development available to NSF-supported graduate students. Fellowships may only be used for an eligible graduate degree program at an academic institution accredited in, and having a campus located in, the US, its territories, possessions, or the Commonwealth of Puerto Rico.

NSF Fellows are anticipated to become knowledge experts who can contribute significantly to research, teaching, and innovations in science and engineering. These individuals are crucial to maintaining and advancing the nation’s technological infrastructure and national security as well as contributing to the economic well-being of society at large.

So that the nation can build fully upon the strength and creativity of a diverse society, the Foundation welcomes applications from all qualified individuals. Women, under-represented minorities and people with disabilities are encouraged to apply.

The fellowship is competitive, and those planning to apply should devote a sincere effort to their application. See the Applicants section for more information.

[4] Bengisu Cagiltay, Bilge Mutlu, Joseph Michaelis, “My Unconditional Homework Buddy: Designing Augmented Homework Experiences for Children with a Social Learning Companion Robot”

We aim to design robotic educational support systems that can promote socially and intellectually meaningful learning experiences for students while they complete school work outside of class. To pursue this goal, we conducted participatory design studies with 10 children (aged 10–12) to explore their design needs for robot-assisted homework. We investigated children’s current ways of doing homework, the type of support they receive while doing homework, and co-designed the speech and expressiveness of a homework companion robot. Children and parents attending our design sessions explained that an emotionally expressive social robot as a homework aid can support students’ motivation and engagement, as well as their affective state. Children primarily perceived the robot as a dedicated assistant at home, capable of forming meaningful friendships, or a shared classroom learning resource. We present key design recommendations for augmenting students’ homework experiences with a learning companion robot to support their educational experiences.

Bengisu Cagiltay, Bilge Mutlu, Margaret Kerr, “Family Theories in Child-Robot Interactions: Understanding Families as a Whole for Child-Robot Interaction Design”

In this work we discuss a theoretically motivated family-centered design approach for child-robot interactions, adapted by Family Systems Theory (FST) and Family Ecological Model (FEM). Long-term engagement and acceptance of robots in the home is influenced by factors that surround the child and family, such as child-sibling-parent relationships, family routines, rituals, and values. A family-centered approach to interaction design is essential when developing in-home technology for children, especially for social agents like robots that can form relationships and connections with them. We review related literature in family theories and connect it with child-robot interaction and child-computer interaction research. We present two case studies that exemplify how family theories, FST and FEM, can inform how robots might be integrated into homes when applied to research in child-robot and family-robot interaction design. Finally, we pose five overarching recommendations for a family centered design approach in child-robot interactions.

March 27, 2023

Prof Ming Liu and collaborators had their paper “ZNS: An Elastic Zoned Namespace for Commodity ZNS SSDs” accepted into OSDI ‘23. Awesome! [1]

Prof Bilge Mutlu and collaborators had the paper “Designing Parent-child-robot Interactions to Facilitate In-Home Parental Math Talk with Young Children” accepted into IDC ‘23. Well done! [2]

Prof Mohit Gupta and collaborators had the paper “Seeing Photons in Color” accepted into SIGGRAPH ‘23. (link to video) Great work! [3]

Prof Paris Koutris and collaborators had the paper “Space-Time Tradeoffs for Conjunctive Queries with Access Patterns” accepted into PODS ‘23. Fantastic! [4]

Prof Matt Sinclair was selected as an ARLIS faculty mentor for this summer. Cool! [5]

Prof Jerry Zhu captured a lovely picture of the Aurora Borealis display last week. We include it here because of how cool it was.

Notes:

[1] ZNS: An Elastic Zoned Namespace for Commodity ZNS SSDs

Authors: Jaehong Min, Chenxingyu Zhao, Ming Liu, Arvind Krishnamurthy

Emerging Zoned Namespace (ZNS) SSDs, providing the coarse-grained zone abstraction, hold the potential to significantly boost the cost-efficiency of future storage infrastructure and mitigate performance unpredictability. However, existing ZNS SSDs have a static zoned interface, making them in-adaptable to workload runtime behavior, unscalable to underlying hardware capabilities, and interfering with co-located zones. Applications either under-provision the zone resources yielding unsatisfied throughput, create over-provisioned zones that squander the cost, or experience unexpected I/O latencies.

We propose eZNS, an elastic zoned namespace interface that exposes an identical zone with predictable characteristics. eZNS comprises two major components: a zone arbiter that manages zone allocation and active resources on the control- plane, a hierarchical I/O scheduler with read congestion control and write admission control on the data-plane. Together, eZNS enables the transparent use of a ZNS SSD and closes the gap between application requirements and zone interface properties. Our evaluations over RocksDB demonstrate that eZNS outperforms a static zoned interface by 84.5% and 52.1% in throughput and tail latency, respectively, at most.

[2] Hui-Ro Hi, Nathan White, Edward Hubbard, Bilge Mutlu, “Designing Parent-child-robot Interactions to Facilitate In-Home Parental Math Talk with Young Children”

 ACM Interaction Design and Children (IDC) Conference

Parent-child interaction is critical for child development, yet parents may need guidance in some aspects of their engagement with their children. Current research on educational math robots focuses on child-robot interactions but falls short of including the parents and integrating the critical role they play in children’s learning. We explore how educational robots can be designed to facilitate parent-child conversations, focusing on math talk, a predictor of later math ability in children. We prototyped capabilities for a social robot to support math talk via reading and play activities and conducted an exploratory Wizard-of-Oz in-home study for parent-child interactions facilitated by a robot. Our findings yield insights into how parents were inspired by the robot’s prompts, their desired interaction styles and methods for the robot, and how they wanted to include the robot in the activities, leading to guidelines for the design of parent-child-robot interaction in educational contexts.

[3] Seeing Photons in Color

Sizhuo Ma, Varun Sundar, Paul Mos, Claudio Bruschini, Edoardo Charbon, and Mohit Gupta

SIGGRAPH ‘23

Abstract: Megapixel single-photon avalanche diode (SPAD) arrays have been developed recently, opening up the possibility of deploying SPADs as passive, general-purpose cameras for computer vision and photography. However, most previous works on SPADs have focused on monochrome imaging. We propose a computational photography approach that reconstructs high-quality color images from mosaicked binary image sequences captured by a SPAD array, even for high-dynamic-range scenes with complex and rapid motion. Inspired by conventional burst photography techniques, we design algorithms that jointly denoise and demosaick quanta image sequences. Based on the observation that motion effectively increases the color sample rate, we design a blue-noise pseudorandom RGBW color filter array for SPADs, which is tailored for imaging dark, dynamic scenes. Results on simulated data, as well as real data captured with a fabricated color SPAD hardware prototype shows that the proposed method can reconstruct high-quality images with minimal color artifacts even for challenging low-light, high-dynamic-range and fast-moving scenes. We hope this paper, by adding color to computational single-photon imaging, spurs rapid adoption of SPADs for real-world passive imaging applications.

[4] “Space-Time Tradeoffs for Conjunctive Queries with Access Patterns”

Hangdong Zhao, Shaleen Deep, Paris KoutrisPODS ‘23

Abstract: In this paper, we investigate space-time tradeoffs for answering conjunctive queries with access patterns (CQAPs). The goal is to create a space-efficient data structure in an initial preprocessing phase and use it for answering (multiple) queries in an online phase. Previous work has developed data structures that trade-off space usage for answering time

for queries of practical interest, such as the path and triangle query. However, most of these results cater to only those queries, lack a comprehensive framework, and are not generalizable.

Our main contribution is a general algorithmic framework for obtaining space-time tradeoffs for any CQAP. Our framework builds upon the PANDA algorithm and tree decomposition techniques. We demonstrate that our framework captures all state-of-the-art tradeoffs that were independently produced for various queries. Further, we falsify two conjectures proposed in the existing literature that relates to the space-time lower bound for path queries and triangle detection by proposing improved (and surprising) tradeoffs.

[5] See https://www.arlis.umd.edu/risc_faculty_mentors

The Applied Research Laboratory for Intelligence and Security (ARLIS) at the University of Maryland, College Park, is seeking faculty mentors to provide technical support for its summer internship program, Research for Intelligence & Security Challenges (RISC). This exciting 10-week paid program will pair students with mentors from UMD, INSURE consortium members, and other top universities. Those teams work in collaboration with sponsors from the Department of Defense (DOD) and Intelligence Community (IC) communities. Participating students are considered for security clearance sponsorship and then have the potential opportunity to be considered for continued work through the academic year or future employment with ARLIS or the US government.

March 20, 2023

Prof Miron Livny, as you have heard, was selected as a Vilas Research Professor, one of campus’s highest honors.  [1]

Prof Swamit Tannu and collaborators have two papers accepted for appearance at ISCA ‘23 (the 50th ISCA!). Titles: “Enabling High-Performance Debugging for Variational Quantum Algorithms using Compressed Sensing” and “ Scaling Qubit Readout with Hardware-Efficient Machine Learning Architectures”. [2]

Prof Bilge Mutlu and students won the “Best Systems Paper Award” at HRI ‘23 for their work on “Lively: Enabling Multimodal, Lifelike, and Extensible Real-time Robot Motion”.  [3]

Prof Yong Jae Lee received a Sony Focused Research Award for the proposal “Controllable High-Quality Image Generation using Diffusion Models”.  [4]

Prof Suman Banerjee and Prof Mohit Gupta had their paper “When Two Cameras Are a Crowd: Understanding and Handling Interference Across Multiple Active Cameras” accepted into Communications of the ACM as a review article.  [5]

Notes

[1] Per Amos: The Vilas Research Professorship (VRP) is the most prestigious endowed chair on our Campus.  There are 20 VRP chairs allotted to Campus by the Vilas board of trustees. Traditionally, the holders of the distinction have it for life (i.e., until retirement). So, a Campus-wide competition for a new VRP happens if and when one of the current holders retires. A necessary condition for becoming a VRP holder is to be one of the best known UW faculty internationally, and to be one of the most impactful UW faculty. In the last 35 years,  members of our department won a VRP exactly twice (Sohi, now Livny).

https://rsp.wisc.edu/Vilas/index.cfm

[2] “Enabling High-Performance Debugging for Variational Quantum Algorithms using Compressed Sensing”

Authors: Tianyi Hao, Kun Liu, Swamit Tannu

Abstract: Variational quantum algorithms (VQAs) have the potential to solve practical problems using contemporary Noisy Intermediate Scale Quantum (NISQ) computers in the near term. VQAs find near-optimal solutions in the presence of qubit errors by classically optimizing a loss function computed by parameterized quantum circuits. However, developing and testing VQAs is challenging due to the limited availability of quantum hardware, their high error rates, and the significant overhead of classical simulations. Furthermore, VQA researchers must pick the right initialization for circuit parameters, utilize suitable classical optimizer configurations, and deploy appropriate error mitigation methods. Unfortunately, these tasks are done in an ad-hoc manner today, as there are no software tools to configure and tune the VQA hyperparameters.

In this paper, we present OSCAR (cOmpressed Sensing based Cost lAndscape Reconstruction) to help VQA researchers configure: 1) correct initialization, 2) noise mitigation techniques, and 3) classical optimizers to maximize the quality of the solution on NISQ hardware. OSCAR enables efficient debugging and performance tuning by providing users with the loss function landscape without running thousands of quantum circuits as required by the grid search. Using the compressed sensing technique, we can accurately reconstruct the complete cost function landscape with a 20X to 100X speedup. Furthermore, OSCAR can compute an optimizer function query in an instant by interpolating a computed landscape, thus enabling the trial run of a VQA configuration with considerably reduced overhead.

“Scaling Qubit Readout with Hardware-Efficient Machine Learning Architectures”

Authors: Satvik Maurya, Chaithanya Naik, William Oliver, Benjamin Lienhard, Swamit Tannu

Abstract: Reading a qubit is a fundamental operation in quantum computing. It translates quantum information into classical information enabling subsequent classification to assign the qubit to the `0′ or `1′ state.

Unfortunately, qubit readout is one of the most error-prone and slowest operations on a superconducting quantum processor. On state-of-the-art superconducting quantum processors, readout errors can range from 1-10\%. These errors occur for various reasons — crosstalk, spontaneous state transitions, and excitation caused by the readout pulse. The error-prone nature of readout has resulted in significant research to design better discriminators to achieve higher qubit-readout accuracies. The readout accuracy impacts the benchmark fidelity for Noisy Intermediate Scale Quantum (NISQ) applications or the logical error rate in error-correcting codes such as the surface code.

Prior works have used machine-learning-assisted single-shot qubit-state classification, where a deep neural network was used for more robust discrimination by compensating for crosstalk errors. However, the neural network size can limit the scalability of systems, especially if fast hardware discrimination is required. This state-of-the-art baseline design cannot be implemented on off-the-shelf FPGAs or Cryogenic ASICs used for the control and readout of superconducting qubits in most systems, which increases the overall readout latency since discrimination has to be performed in software. In this work, we propose HERQULES, a scalable approach to improving qubit-state inference using matched filters in conjunction with a significantly smaller and scalable neural network for qubit-state discrimination. We achieve substantially higher readout accuracies (42\% relative improvement) than the baseline with a scalable design that can be readily implemented on off-the-shelf FPGAs. We also show that HERQULES is more versatile and can support shorter readout durations than the baseline design without additional training overheads.

[3]  “Lively: Enabling Multimodal, Lifelike, and Extensible Real-time Robot Motion”

Andrew Schoen, Dakota Sullivan, Ze Dong Zhang, Daniel Rakita, Bilge Mutlu

ACM/IEEE International Conference on Human-Robot Interaction (HRI 2023)

Robots designed to interact with people in collaborative or social scenarios must move in ways that are consistent with the robot’s task and communication goals. However, combining these goals in a naïve manner can result in mutually exclusive solutions, or infeasible or problematic states and actions. In this paper, we present Lively, a framework which supports configurable, real-time, task-based and communicative or socially-expressive motion for collaborative and social robotics across multiple levels of programmatic accessibility. Lively supports a wide range of control methods (ie, position, orientation, and joint-space goals), and balances them with complex procedural behaviors for natural, lifelike motion that are effective in collaborative and social contexts. We discuss the design of three levels of programmatic accessibility of Lively, including a graphical user interface for visual design called LivelyStudio, the core library Lively for full access to its capabilities for developers, and an extensible architecture for greater customizability and capability.

Photo of the lead authors, CS grad students Andy Schoen and Dakota Sullivan, who just presented the paper in Stockholm:

CS grad students Andy Schoen and Dakota Sullivan

[4] “Controllable High-Quality Image Generation using Diffusion Models”

Yong Jae Lee

Sony Focused Research Award

https://www.sony.com/en/SonyInfo/research-award-program/#FocusedResearchAward

Abstract: This project proposes novel diffusion-based image generation algorithms for controllable visual content creation. While text-to-image diffusion models have shown astounding performance in the past couple of years, existing models lack precise control over the generated content. In this project, we will equip the models with the ability to control the location, size, and category of the generated objects in complex scenes. We will achieve this by conditioning the diffusion model on an input scene layout that provides the bounding box and category information of each object instance in the desired generated image. The proposed algorithm could be useful for content creators who want to generate realistic images by providing both a text prompt describing the scene as well as the objects’ localization information.

[5]  “When Two Cameras Are a Crowd: Understanding and Handling Interference Across Multiple Active Cameras”

Jongho Lee, Mohit Gupta, Bhuvana Krishnaswamy, Suman Banerjee

Project webpage: https://wisionlab.com/project/stochastictdma/

Abstract: Vision and robotics systems enabled by cameras which recover 3D scene geometry are revolutionizing several aspects of our lives via technologies such as autonomous transportation, robotic surgery, and `hands-free’ user interfaces. Modern 3D cameras are active devices, where a programmable light source emits coded illumination. The emitted light gets reflected from the scene, and is received by a sensor to infer the 3D structure of the surroundings. In a multi-camera environment, such active 3D cameras may receive light from the sources of other cameras, resulting in large depth errors. This problem is becoming increasingly important due to the emergence of low-cost and compact active 3D cameras, which are becoming ubiquitous across a wide range of applications, from consumer devices to vehicular vision systems. We observe that the multi-camera interference (MCI) problem shares several similarities and dissimilarities with the common interference problems in the RF domain. Based on this observation, we describe new and emerging challenges when multiple active 3D cameras operate in the same spatio-temporal region. We also outline some solution approaches and more importantly, highlight the next steps ahead.

March 6, 2023

Prof Bart Miller and assorted folks – including stellar work from Gigi Mitchell – ran another successful Job Fair over at Union South on Feb 20th. Over 400 students, over 20 companies. Well done! [1]

Prof Shivaram Venkataraman’s student Jason Mahoney was selected as an Apple AI/ML Scholar. Awesome! [2]

Prof Patrick McDaniel and students had their paper, “Understanding the Ethical Frameworks of Internet Measurement Studies”, win a best paper award at the EthiCS workshop at NDSS ‘23. Great job! [3]

Prof Jin-Yi Cai and students had the paper “A Complexity Trichotomy for k-Regular Asymmetric Spin Systems Using Number Theory” accepted into Computational Complexity. Wonderful news! [4]

Notes:

[1]  On February 20th, we held the spring CDIS Job Fair in Union South. Over 400 UW students from across CDIS attended the fair, meeting with more than 20 companies. In addition, students were invited to attend from smaller colleges in our region, including Beloit College, Edgewood College, UW Whitewater, and Carthage College. Varsity Hall was busy from the 10am opening until the 5pm closing. Usually, the spring job fair is much smaller, but there seemed to be increased interest this year. Congrats to Gigi, who does such an amazing job organizing the event.

[2] https://machinelearning.apple.com/updates/apple-scholars-aiml-2023

Apple is proud to announce the 2023 recipients of the Apple Scholars in AI/ML PhD fellowship.

We are committed to supporting the academic research community by amplifying emerging leaders in their field and their cutting-edge machine learning research.

The Apple Scholars in AI/ML PhD fellowship recognizes the contributions of researchers in computer science and engineering at the graduate and postgraduate level. Each Scholar receives funding as they pursue their PhD, internship opportunities, and mentorship with an Apple researcher in their field. Apple Scholars in AI/ML are selected based on their innovative research, record as thought leaders and collaborators, and commitment to advancing their respective fields.

[3] Eric Pauley and Patrick McDaniel, “Understanding the Ethical Frameworks of Internet Measurement Studies”, The 2nd International Workshop on Ethics in Computer Security (EthiCS 2023) hosted at NDSS, February, 2023. San Diego, CA.

https://www.ndss-symposium.org/wp-content/uploads/2023/02/ethics2023-239547-paper.pdf

Workshop: https://ethics-workshop.github.io/2023/

Abstract: Measurement of network data received from or transmitted over the public Internet has yielded a myriad of insights towards improving the security and privacy of deployed services. Yet, the collection and analysis of this data necessarily involves the processing of data that could impact human subjects, and anonymization often destroys the very phenomena under study. As a result, Internet measurement faces the unique challenge of studying data from human subjects who could not conceivably consent to its collection, and yet the measurement community has tacitly concluded that such measurement is beneficial and even necessary for its positive impacts. We are thus at an impasse: academics and practitioners routinely collect and analyze sensitive user data, and yet there exists no cohesive set of ethical norms for the community that justifies these studies. In this work, we examine the ethical considerations of Internet traffic measurement and analysis, analyzing the ethical considerations and remediations in prior works and general trends in the community. We further analyze ethical expectations in calls-for-papers, finding a general lack of cohesion across venues. Through our analysis and recommendations, we hope to inform future studies and venue expectations towards maintaining positive impact while respecting and protecting end users.

[4] “A Complexity Trichotomy for k-Regular Asymmetric Spin Systems Using Number Theory” by Jin-Yi Cai, Zhiguo Fu, Kurt Girstmair, Michael Kowalczyk. Accepted into Computational Complexity.

Suppose φ and ψ are two angles satisfying tan(φ)=2 tan(ψ)>0. We prove that under this condition φ and ψ cannot be both rational multiples of π. We use this number theoretic result to prove a classification of the computational complexity of spin systems on k-regular graphs with general (not necessarily symmetric) real valued edge weights. We establish explicit criteria, according to which the partition functions of all such systems are classified into three classes: (1) Polynomial time computable, (2) #P-hard in general but polynomial time computable on planar graphs, and (3) #P-hard on planar graphs. In particular, problems in (2) are precisely those that can be transformed by a holo-graphic reduction to a form solvable by the Fisher-Kasteleyn-Temperley algorithm for counting perfect matchings in a planar graph.

February 27, 2023

Prof Dieter van Melkebeek once again led many Wisconsin student teams to great success in this year’s International Collegiate Programming Contest (ICPC), including placing 1st, 5th, 13th, 14th, 26th, and 29th among the 116 teams present. Amazing! [1]

Prof Mohit Gupta and collaborators had the paper “CASPI: Collaborative Photon Processing for Active Single-Photon Imaging” accepted into Nature Communications. Well done! [2]

Prof Paul Barford and collaborators had the paper “PowerPing: Measuring the Impact of Power Outages on Internet Hosts in the US” accepted into the IFIP International Conference on Critical Infrastructure Protection. Great work! [3]

Prof Matt Sinclair is co-hosting/running a gem5 tutorial co-located with HPCA. Great! [4]

And last, but certainly not least, Prof Line Roald (ECE) and Prof Shivaram Venkataraman (CS) collaborated on a new, long term, joint project known as “parenthood”. It began with the introduction of Arjun Roald-Raman, born Friday afternoon on February 24th. A warm congratulations to all three of them as they embark on this new life adventure!

Notes:

[1] Yesterday was the regional round of this year’s International Collegiate Programming Contest (ICPC). UW-Madison participated with 6 full teams of 3 students each, and one 2-student team. We had a very new cohort this year – 15 of the 17 team members other than our top team were first-time participants. Our full teams placed 1st, 5th, 13th, 14th, 26th, and 29th among the 116 teams in our region. See the scoreboard for more details.

Some statistics – This is the 4th year in a row that UW-Madison places first in the regional competition, and the 14th year in the 22 years since we started participating in 2001. The top team advances to the North America Championship in May, from where about 15 North American teams advance to the world finals. UW-Madison has advanced to the world finals every year since 2001, a title no other school in North America can claim. I have good hopes that we will be able to extend that stretch by one more year!

Dieter would like to thank the people at Epic for running a regional site on their campus, and to Nitit Jongsawatsataporn and Mingrui Liu (two of our world finalists from last year) for help coaching our teams this year.

The UW-Madison teams that participated in the regional competition:

  • – debug with cout (1st place): Joseph Cai, Ivan Hu, Boying Li

  • NonAutomatic Automaton (5th place): Anton Cheng, Leos Nguyen, Kiet Pham

  • Ain’t No Way (13th place): Pranav Pulijala, Pranav Pullabhotla, Raymond Zhao

  • Spaghetti Sort (14th place): Huaiyuan Jing, Abhay Punjabi, Lucas Scharenbroch

  • GEKOta (26th place): Ruixuan Tu, Jeffrey Wang, Boyuan Zou

  • CrazyThursdayVme_50 (29th place): Zhilin Du, Jonathan Yang, Hongtao Zhang

  • Least Contrived Implementation: Pramod Anandarao, Chen Li, (third member had to drop out two days before the contest)

[2] CASPI: Collaborative Photon Processing for Active Single-Photon Imaging

Jongho Lee, Atul Ingle, Jenu Chacko, Kevin Eliceiri, and Mohit Gupta

Abstract: Image sensors capable of capturing individual photons have made tremendous technological progress in recent years. Single-photon cameras, which once used to be a niche technology with very few pixels, have now achieved kilo-to-megapixel resolution and have found their way into commercial devices such as smartphones. However, this technology faces a major limitation—because they capture scene information at the individual photon level, the raw data is extremely sparse and noisy. Unlike conventional CMOS image sensors that have mature image processing pipelines implemented on-chip, single-photon cameras currently do not have any standard well-agreed-upon processing pipelines. Here we propose CASPI: Collaborative Photon Processing for Active Single-Photon Imaging. CASPI is a technology-agnostic, application-agnostic, training-free, and blind photon processing pipeline for emerging high-resolution single-photon cameras. CASPI exploits the abundant spatio-temporal correlations that are present in the raw photon data cubes captured by a single-photon camera. Our key observation is that if both local and non-local correlations in the cubes are collaboratively exploited, we can estimate scene properties reliably even under extreme illumination conditions. CASPI is versatile and can be integrated into a wide range of imaging applications that could benefit from the high-speed and high-resolution single-photon camera data, including fluorescence microscopy, machine vision, and long-range 3D imaging. We experimentally demonstrate LiDAR imaging over a wide range of photon flux levels, from a sub-photon regime ($<1$ average signal photon per pixel) to extremely high ambient sunlight regime ($>20,000$ lux) and live-cell autofluorescence FLIM in extremely low photon count regimes ($10$ photons per pixel) where state-of-the-art methods fail to provide reliable lifetime estimates. We envision CASPI as a basic building block of general-purpose \emph{photon processing units} (analogous to conventional image processing units) that will be implemented on-chip in future single-photon camera sensor hardware.

Project: https://wisionlab.com/project/caspi/

[3] Scott Anderson, Tucker Bell, Patrick Egan, Nathan Weinshenker, and Paul Barford, “PowerPing: Measuring the Impact of Power Outages on Internet Hosts in the US”, To appear in Proceedings of the 17th IFIP International Conference on Critical Infrastructure Protection, Washington DC, March, 2023.

Abstract:

Power outage is a well-known threat to Internet communication systems.  While service providers address this threat via power-backup systems (e.g., diesel generators and UPS) in PoPs and datacenters, office buildings or private homes may not have similar capabilities. In this paper we describe an empirical study that assesses how power outages in the US impact end-host access to the Internet.  To conduct this study we built a system named PowerPing that monitors a power outage reporting website poweroutage.us) and measures end-host responsiveness in the affected areas.  We use PowerPing to collect power outage and end-host responsiveness data over one year from June 2020 through July 2021. We find that power outages that affect more than 10% of customers in a county occur at a rate of about 50 events/day; each outage typically affects about 3k customers, and service is typically restored in just under 2 hours.  We also report on the end-host responsiveness characteristics for typical power-outage events and on several major outage events that occurred during our study.  Surprisingly, we find only a weak correlation between power-outage impact and service restoration periods vs. end-host responsiveness. Our findings suggest that improving backup power for network devices in both the home as well as at the ISP may improve end-host connectivity during typical power outages.

[4] Some details are here: https://www.gem5.org/events/hpca-2023

The gem5 Tutorial will be held in the morning session and will focus on teaching those new to gem5 how to use the latest version. It will be “crash course” in gem5 and assume no prior knowledge of using computer architecture simulators. The tutorial will focus heavily on new features in gem5, such as the gem5 standard library, so may be suitable for those who have used gem5 before but wish to refresh their skills.

February 20, 2023

Prof Mohit Gupta received a Cruise Research Award for his proposal “Analysis of Single-Photon Sensors for Long-range 3D Imaging”. Well done! [1]

Prof Jelena Diakonikolas received the Meritorious Service Award from the journal Mathematical Programming for “exceptional diligence in their service” to the journal. Wonderful job! [2]

Prof Jin-Yi Cai and student had their paper “Dichotomy Result on 3-Regular Bipartite Non-negative Functions” appear in the journal Theoretical Computer Science. Great job! [3]

Prof Andrea Arpaci-Dusseau and Prof Remzi Arpaci-Dusseau received the UC Berkeley EECS Distinguished Alumni award for “outstanding innovation in the fields of storage and operating systems and their leadership in education and outreach”. [4]

Notes:

[1] Cruise Research Award (1 year)

Title: Analysis of Single-Photon Sensors for Long-range 3D Imaging

PI: Mohit Gupta

Abstract: We propose a theoretical analysis of single-photon 3D imaging cameras to assess their suitability for use in challenging scenarios, including sensor and scene motion, strong sunlight, and non-cooperative objects such as retro-reflectors. We will develop a probabilistic image formation model for coded single-photon imaging, and a mathematical framework for exploring and analyzing the space of coding functions. Based on this framework, we will derive physical and estimation-theoretic limits of this novel sensing modality via mathematical tools such as Cramer-Rao lower bound analysis. This foundational analysis will be summarized to provide practical guidelines on when (and whether) to use single-photon sensors (SPADs) for high-resolution, long-range 3D imaging under adverse sensing conditions.

[2] https://www.springer.com/journal/10107

Mathematical Programming rests on the knowledge and fortitude of its referees who willingly volunteer their time to provide timely, judicious, and considerate reviews. The Meritorious Service Award was created to acknowledge these continued efforts. In 2022 our Editorial Board assessed the referees who have demonstrated exceptional diligence in their service to the journal. After a detailed examination of all the referee reports, you have been selected to receive the 2022 Meritorious Service Award due to your outstanding contributions.

Congratulations! Your name will be listed on the journal website along with the other winners. We will also send you an award certificate as a token of our deep gratitude for your dedicated service.

[3] Dichotomy Result on 3-Regular Bipartite Non-negative Functions

Jin-Yi Cai and student Auten Fan

Theoretical Computer Science

We prove a complexity dichotomy theorem for a class of Holant problems on 3-regular bipartite graphs. Given an arbitrary nonnegative weighted symmetric constraint function f = [x_0, x_1, x_2, x_3], we prove that the bipartite Holant problem Holant(f | (=_3)) is either computable in polynomial time or \#P-hard. The dichotomy criterion on f is explicit.

[4] Distinguished Alumni Awards winners are selected each year by the EE and CS Chairs in consultation with the EECS Faculty Awards Committee and with input from the EECS faculty. Winners receive a plaque awarded at a departmental event such as the BEARS conference or the L&S CS Commencement.

This year’s info: https://eecs.berkeley.edu/research/bears/2023/distinguished-alumni

List of previous winners: https://eecs.berkeley.edu/people/alumni/cs-distinguished-alumni

Citation: For outstanding innovation in the fields of storage and operating systems and their leadership in education and outreach.

Arpaci-Dusseaus receive alumni award

February 13, 2023

Prof Mohit Gupta received a Sony Faculty Innovation Award entitled “Photon-Starved Scene Inference Using Single-Photon Cameras”. Great! [1]

Prof Jin-Yi Cai and students had the paper “The complexity of counting planar graph homomorphisms of domain size 3” accepted into STOC ‘23 and the paper “Bipartite 3-Regular Counting Problems with Mixed Signs” accepted into the Journal of Computer and System Sciences. Great job! [2]

Prof Matt Sinclair gave an invited talk at ORNL entitled “Characterizing and Harnessing Performance Variability in Accelerator-rich HPC Systems”. Well done! [3]

Notes:

[1] Sony Faculty Innovation Award

Title: Photon-Starved Scene Inference Using Single-Photon Cameras

PI: Mohit Gupta

Abstract: The goal of this proposal is to develop vision systems that can recover scene information under practical constraints such as limited power, latency and bandwidth. Imagine a robot exploring an underground cave, a robot arm performing inspection of dark machine parts, or a telescope imaging distant galaxies. In such extreme conditions, images captured by conventional cameras get overwhelmed by noise, causing the signal-to-noise ratio (SNR) to dip below the threshold required for downstream inference algorithms to extract meaningful scene information. To achieve these goals, we propose developing novel signal-processing, computer vision and machine learning algorithms for raw data captured by single-photon cameras. Broadly, the proposed work will consist of (a) designing algorithms for recovering low-level image features such as edges and corners directly from photon streams, (b) designing photon-processing and filtering algorithms for active single-photon imaging (e.g., to support long-range LiDAR), and (c) develop novel photon-stream representations for power- and bandwidth-efficient single-photon imaging (both passive and active scenarios).

[2] The complexity of counting planar graph homomorphisms of domain size 3

Jin-Yi Cai and Ashwin Maran

STOC 2023.

We prove a complexity dichotomy theorem for counting planar graph homomorphisms of domain size 3. Given any 3 by 3 real valued symmetric matrix H defining a graph homomorphism from all planar graphs G \mapsto Z_H(G), we completely classify the computational complexity of this problem according to  the matrix H. We show that for every H, the problem is either polynomial time computable or \#P-hard. The P-time computable cases consist of precisely those that are  P-time computable for general graphs (a complete classification is known) or computable by Valiant’s holographic algorithm via matchgates. We also prove several results about planar graph homomorphisms for general domain size q. The proof uses mainly analytic arguments.

and

Bipartite 3-Regular Counting Problems with Mixed Signs

Jin-Yi Cai and Auten Fan and Hugh Liu

Journal of Computer and System Sciences

We prove a complexity dichotomy for a class of counting problems expressible as bipartite 3-regular Holant problems. These are also counting CSP problems where every constraint has arity 3 and every variable is read-thrice. The constraint function can take both positive and negative values, allowing for cancellations.

[3] Characterizing and Harnessing Performance Variability in Accelerator-rich HPC Systems

Matt Sinclair talk at ORNL

Abstract: Scientists are increasingly exploring and utilizing the massive parallelism of general-purpose accelerators such as GPUs for scientific breakthroughs. As a result, datacenters, hyperscalers, national computing centers, and supercomputers have procured hardware to support this evolving application paradigm. These systems contain hundreds to tens of thousands of accelerators, enabling peta- and exa-scale levels of compute for scientific workloads. Recent work demonstrated that power management (PM) can impact application performance in CPU-based HPC systems, even when machines have the same architecture and SKU (stock keeping unit). This variation occurs due to manufacturing variability and the chip’s PM. However, while modern HPC systems widely employ accelerators such as GPUs, it is unclear how much this variability affects applications. In this talk, I will discuss our work on characterizing the extent of variation due to GPU PM in modern HPC and supercomputing systems across five large-scale computing centers with modern GPUs: Oak Ridge’s Summit, Sandia’s Vortex, TACC’s Frontera and Longhorn, and Livermore’s Corona. Regardless of the application, cluster, GPU vendor, and cooling method, our results show significant variation: 8% (max 22%) average performance variation even though the GPU architecture and vendor SKU are identical within each cluster, with outliers up to 1.5X slower than the median GPU. These results highlight the difficulty in efficiently using existing GPU clusters for modern HPC and scientific workloads, and the need to embrace variability in future accelerator-based systems.  In the second half of the talk, I will discuss our efforts to embrace this variability.