Jelena Diakonikolas receives NSF CAREER Award

By Rachel Robey

By applying optimization methods to machine learning tasks, Diakonikolas is “creating the theory from the bottom up.”

Professor Jelena Diakonikolas

Assistant Professor Jelena Diakonikolas, a researcher in the field of large-scale optimization, recently received a National Science Foundation (NSF) CAREER Award for her proposal “Optimization and Learning with Changing Distributions.” The award comes with $700,000 in funding to support research and education initiatives through 2029.

“It’s a big milestone, and it signals that the type of work I do is appreciated by the scientific community,” says Diakonikolas, who describes UW–Madison’s Department of Computer Sciences as “rare, maybe even unique” for having an optimization research group. “Because I primarily study optimization, my research profile is atypical of the projects funded by the NSF’s Algorithmic Foundations program. Getting this feedback is both rewarding and validating.”

Creating the theory from the bottom up 

For Diakonikolas, who joined the CS faculty in 2020, receiving the CAREER Award comes on the heels of being distinguished through the Air Force Office of Scientific Research (AFOSR) Young Investigators Program. While her broad areas of expertise are the theory and mathematics of optimization algorithms, Diakonikolas’ CAREER proposal looks at a more recent addition to her research portfolio: the machine learning applications of optimization methods.

“My CAREER proposal addresses basic learning tasks like learning a single neuron and linear classification in situations involving changing distributions,” says Diakonikolas. This work expands on previous research with collaborator (and husband) Professor Ilias Diakonikolas, graduate students Nikos Zarifis, Puqian Wang, and Shuyao Li, and former UW–Madison postdoc Sushrut Karmalkar. 

“Though these are seemingly simple problems, once you start looking at them closely, it turns out they’re quite challenging,” Diakonikolas continues. “However, the insights from our past work are forming a very strong foundation for studying them.”

The primary goal of the proposal is to build out a missing body of knowledge and a theoretical framework for these basic learning tasks. By “creating the theory from the bottom up,” Diakonikolas plans to one day build up to more complex machine learning systems and problems that may benefit applied research.

Addressing a common problem in data science research

Early inspiration for the project came several years ago, when Diakonikolas, an affiliate of UW–Madison’s Data Science Institute (DSI), met with collaborators in DSI and the College of Agricultural and Life Sciences (CALS) to discuss climate-smart agriculture and forestry, a growing area of interest for researchers and practitioners across Wisconsin. Throughout those conversations, Diakonikolas noticed a pattern in the obstacles these experts encountered: the theoretical machine learning results were not specific enough to address the dynamic, sometimes imperfect, real-world data sets used in applied research.

For example, a climate-smart agricultural operation might depend on irrigation systems informed by sensors reporting on soil conditions—yet it would be impossible to have sensors everywhere. The result is a gap in the data. 

“In this hypothetical scenario, you want to be able to predict the soil conditions even in places without sensors, so you need a solution that optimizes performance across all possible locations,” says Diakonikolas. “Here we tacitly assume that conditions in unobserved locations are similar to (although distinct from) the observed conditions and look for a solution that is competitive with the worst-case conditions.” This modeling technique, which seeks the best solution to a problem by optimizing against the “worst case” data, is known as “distributionally robust optimization.”
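To make the worst-case idea concrete, here is a minimal sketch in Python. It is an illustration only, not the method from the CAREER proposal: the site names and soil-moisture readings are invented, the "model" is a single constant prediction, and the set of plausible distributions is simply one distribution per sensor site, a strong simplification of how distributionally robust optimization is usually posed.

```python
# Hypothetical soil-moisture readings (percent), grouped by sensor site.
# All names and values are made up for illustration.
readings = {
    "site_a": [20.0, 22.0, 21.0],
    "site_b": [30.0, 31.0, 29.0],
    "site_c": [24.0, 26.0, 25.0, 25.0, 24.0, 26.0],
}

def mean_squared_error(prediction, values):
    """Average squared error of one constant prediction on a list of readings."""
    return sum((prediction - v) ** 2 for v in values) / len(values)

def average_loss(prediction):
    """Loss under the pooled distribution of all readings (standard training)."""
    pooled = [v for values in readings.values() for v in values]
    return mean_squared_error(prediction, pooled)

def worst_case_loss(prediction):
    """Largest loss over the candidate distributions (here, one per site)."""
    return max(mean_squared_error(prediction, values) for values in readings.values())

# Search a grid of candidate predictions from 15.0 to 35.0 in steps of 0.1.
candidates = [15.0 + 0.1 * i for i in range(201)]

standard_prediction = min(candidates, key=average_loss)    # minimizes average loss
robust_prediction = min(candidates, key=worst_case_loss)   # minimizes worst-case loss

print("standard:", round(standard_prediction, 1), "worst case:", round(worst_case_loss(standard_prediction), 2))
print("robust:  ", round(robust_prediction, 1), "worst case:", round(worst_case_loss(robust_prediction), 2))
```

In this toy setting the standard prediction is pulled toward the site with the most readings, while the robust prediction gives up a little average accuracy to lower the error at the worst-served site.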

Another recurring problem Diakonikolas noticed fell into the realm of “performative prediction,” a conceptual framework suggesting that as soon as solutions are introduced to a problem, they change the problem itself. For instance, an intervention to address climate change (e.g., planting more trees) could alter the very data used to decide on that intervention.
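The feedback loop can be sketched with a toy Python example. Everything here is an assumption made for illustration: the constants are hypothetical, the data distribution is reduced to its mean, and the update rule is plain repeated retraining, a simple baseline from the performative prediction literature rather than anything specific to the proposal.

```python
BASE_MEAN = 10.0    # hypothetical average outcome if no model were deployed
SENSITIVITY = 0.5   # hypothetical strength of the model's effect on the data

def data_mean(deployed_prediction):
    """Mean of the data that would be observed after deploying a prediction."""
    return BASE_MEAN + SENSITIVITY * deployed_prediction

def retrain(deployed_prediction):
    """Refit on the shifted data; under squared loss the best constant is the mean."""
    return data_mean(deployed_prediction)

def repeated_retraining(start, rounds):
    """Deploy, observe the shifted data, retrain, and repeat."""
    prediction = start
    history = [prediction]
    for _ in range(rounds):
        prediction = retrain(prediction)
        history.append(prediction)
    return history

history = repeated_retraining(start=0.0, rounds=30)

# A stable point is a prediction that is still optimal on the data it induces:
# p = BASE_MEAN + SENSITIVITY * p, so p = BASE_MEAN / (1 - SENSITIVITY).
stable_point = BASE_MEAN / (1.0 - SENSITIVITY)

print([round(p, 3) for p in history[:5]])
print(round(history[-1], 6), stable_point)
```

Each retraining step moves the data, which in turn moves the next model. In this sketch the loop settles because the assumed sensitivity is below 1; with a stronger feedback effect, the same procedure would not converge.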

Both distributionally robust optimization and performative prediction are existing areas of study with widespread applications, but Diakonikolas found that neither could get to the heart of the real-world problems encountered by researchers in CALS—and beyond. “These frameworks would make some very general assumptions about the problem we’re trying to solve, but they were not capable of addressing even the most basic learning tasks,” she says.

Diakonikolas’ CAREER proposal begins to bridge that gap. “I observed this common theme of what was missing in the theoretical framework for actually addressing the problems they [the CALS researchers] cared about,” she says. “These issues are pervasive across disciplines and affect applications from image recognition to e-commerce. The results from this work will be of benefit to researchers across many fields.”