Thursday, November 10, 2011 11:00 AM, 1240 CS (Cookies: 10:30)
Abstract: The time and energy complexity of arithmetic operations are highly dependent on their input operands. Synchronous implementations tend to treat all input scenarios uniformly due to a uniform clock cycle period constraint. In this talk I will show how asynchronous circuits can exploit operand-dependent behavior to provide implementations of common arithmetic operations that are, on average, more efficient than conventional solutions.
David Bruton, Jr. Centennial Professor, Department of Computer Sciences, University of Texas at Austin
Phylogenetic placement arises in the analysis of metagenomic data, in which the objective is to insert short molecular sequences (called "query sequences") into an existing phylogenetic tree and alignment on full-length sequences for the same gene. Phylogenetic placement has the potential to provide information beyond pure species identification (i.e., the association of metagenomic reads to existing species), because it can also give information about the evolutionary relationships between these query sequences and known species. We present SEPP, a general "boosting" technique to improve the accuracy and/or speed of phylogenetic placement techniques. The key algorithmic aspect of this booster is the dataset decomposition technique used in SATe (Liu et al., Science 2009), a method that utilizes an iterative divide-and-conquer technique to co-estimate alignments and trees on large molecular sequence datasets. We show that SEPP improves current phylogenetic placement methods, placing metagenomic sequences more accurately when the set of input sequences has a large evolutionary diameter and producing placements of comparable accuracy in a fraction of the time for easier cases. Joint work with Siavash Mirarab and Nam Nguyen, PhD students at UT-Austin.
Department of Computer Science, Harvard University Friday, November 11, 2011 1:00 PM, 4310 CS (Cookies: 1:00)
A potential problem in outsourcing work to commercial cloud computing services is trust. If we store a large data set with a service provider, and ask them to perform a computation on that data set -- for example, to compute the eigenvalues of a large graph, or to compute a linear program on a large matrix derived from a database -- how can we know the computation was performed correctly? Obviously we don't want to compute the result ourselves, and we might not even be able to store all the data locally. This leads to new problems in the streaming paradigm: we consider a streaming algorithm (modelling a user with limited memory and computational resources) that can be assisted by a powerful helper (the service provider). The goal of the service provider is to not only provide the user with an answer, but to convince the user that the answer is correct.
In this talk, I will give a unified overview of a recent line of work exploring the application of proof systems to problems that are streaming in nature. In all of these protocols, an honest service provider can always convince the data owner that the answer is correct, while a dishonest prover will be caught with high probability. The protocols I will discuss utilize and extend powerful ideas from communication complexity and the theory of interactive proofs, and I will argue that many are highly practical, achieving millions of updates per second and requiring little space and communication.
Coralia Cartis, Lecturer, School of Mathematics, University of Edinburgh, United Kingdom Monday, November 14, 2011 4:00 PM, Wisconsin Institute for Discovery (WID), Room 3280B (3rd floor, teaching lab). Anyone without WID access can use the special events elevator on the WID 1st floor (near Aldo coffee shop) to reach the WID 3rd floor teaching lab.
We show that the steepest-descent and Newton's methods for unconstrained nonconvex optimization under standard assumptions may both require a number of iterations and function evaluations arbitrarily close to the steepest-descent's global worst-case complexity bound. This implies that the latter upper bound is essentially tight for steepest descent and that Newton's method may be as slow as the steepest-descent method in the worst case. Then the cubic regularization of Newton's method (Griewank (1981), Nesterov & Polyak (2006)) is considered and extended to large-scale problems, while preserving the same order of its improved worst-case complexity (by comparison to that of steepest-descent); this improved worst-case bound is also proved to be essentially tight. We further show that the cubic regularization approach is, in fact, optimal from a worst-case complexity point of view amongst a class of second-order methods. Time permitting, the problem-evaluation complexity of constrained optimization will also be discussed, and shown to have, surprisingly, the same order as a function of the accuracy as in the unconstrained case. This is joint work with Nick Gould (Rutherford Appleton Laboratory, UK) and Philippe Toint (University of Namur, Belgium).
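For context, the cubic regularization step discussed above minimizes a local cubic model of the objective at each iteration; in the standard formulation (following Griewank and Nesterov & Polyak, with the adaptive weight being a feature of the regularized variants), the step solves

```latex
s_k \;\in\; \arg\min_{s \in \mathbb{R}^n} \; f(x_k) + g_k^{\top} s + \tfrac{1}{2}\, s^{\top} H_k s + \tfrac{\sigma_k}{3} \|s\|^3,
```

where $g_k$ and $H_k$ are the gradient and Hessian (or an approximation) at $x_k$ and $\sigma_k > 0$ is the regularization weight. This method requires $O(\epsilon^{-3/2})$ evaluations to drive the gradient norm below $\epsilon$, compared to $O(\epsilon^{-2})$ for steepest descent -- the bounds whose tightness and optimality are the subject of the talk.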
Rebecca Willett, Assistant Professor, Electrical and Computer Engineering, Duke University
Abstract: Online optimization methods are useful in a variety of applications with sequential observations of a dynamic environment. Often such methods are designed to minimize an accumulated loss metric, and the analysis techniques are appealing because of their applicability in settings where observations cannot be considered independent or identically distributed, and accurate knowledge of the environmental dynamics cannot be assumed. However, such analyses may mask the role of regularization and adaptivity to environmental changes. This work explores regularized online optimization methods and presents several novel performance bounds. Tracking regret bounds relate the accumulated loss of such an algorithm with that of the best possible dynamic estimate that could be chosen in a batch setting, and risk bounds quantify the roles of both the regularizer and the variability of the (unknown) dynamic environment. The efficacy of the method is demonstrated in an online Ising model selection context applied to U.S. Senate voting data.
Abstract: Come meet members of the UW Artificial Intelligence community! This is a chance for new and returning students to get a feel for AI research being done here at UW. It is also an opportunity to introduce students to professors and researchers they may want to work with in the future. This week will feature talks from Prof. Colin Dewey and two of his students, Bo Li and Laura Hobbes Legault. Questions? Contact bgibson@cs.wisc.edu. Light refreshments will be served.
Abstract: Graduate students from a variety of areas in the department will speak on their current research. The featured speakers are: Dan Szafir, Human Computer Interaction; Evan Driscoll, Programming Languages; Lance Hartung, Networking; Spyros Blanas, Databases; Danielle Albers, Graphics/Vision; Adrian Mayorga, Graphics/Vision. This is a great opportunity for graduate students at all levels to learn about some of the exciting research going on in the department. All are encouraged to attend TGIF at 4:30pm in CS 2310 following this event.
Benjamin Recht, Assistant Professor, Department of Computer Sciences, University of Wisconsin--Madison
Abstract: Deducing the state or structure of a system from partial, noisy measurements is a fundamental task throughout the sciences and engineering. The resulting inverse problems are often ill-posed because there are fewer measurements available than the ambient dimension of the model to be estimated. In practice, however, many interesting signals or models contain few degrees of freedom relative to their ambient dimension: a small number of genes may constitute the signature of a disease, very few parameters may specify the correlation structure of a time series, or a sparse collection of geometric constraints may determine a molecular configuration. Discovering, leveraging, or recognizing such low-dimensional structure plays an important role in making inverse problems well-posed. In this talk, I will propose a unified approach to transform notions of simplicity and latent low dimensionality into convex penalty functions. This approach builds on the success of generalizing compressed sensing to matrix completion, and greatly extends the catalog of objects and structures that can be recovered from partial information. I will focus on a suite of data analysis algorithms designed to decompose general signals into sums of atoms from a simple---but not necessarily discrete---set. These algorithms are derived in a convex optimization framework that encompasses previous methods based on l1-norm minimization and nuclear norm minimization for recovering sparse vectors and low-rank matrices. I will provide sharp estimates of the number of generic measurements required for exact and robust recovery of a variety of structured models. I will then detail several example applications and describe how to scale the corresponding inference algorithms to massive data sets.
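The convex penalties described above can be expressed as atomic norms. As an illustration of the construction (notation follows the standard formulation in the literature rather than the talk itself): given a set of atoms $\mathcal{A}$, the penalty is the gauge of its convex hull,

```latex
\|x\|_{\mathcal{A}} \;=\; \inf \{\, t > 0 \;:\; x \in t \cdot \operatorname{conv}(\mathcal{A}) \,\}.
```

Taking $\mathcal{A}$ to be the signed standard basis vectors recovers the $\ell_1$ norm, and taking it to be the unit-norm rank-one matrices recovers the nuclear norm, which is how the framework subsumes sparse-vector and low-rank-matrix recovery as special cases.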
College of Computing, Georgia Institute of Technology Monday, November 21, 2011 4:00 PM, 1240 CS (Cookies: 3:30)
“It is not from the benevolence of the butcher, the brewer, or the baker, that we expect our dinner, but from their regard for their own interest.” Each participant in a competitive economy is “led by an invisible hand to promote an end which was no part of his intention.” -- Adam Smith, 1776.
With his treatise, The Wealth of Nations, 1776, Adam Smith initiated the field of economics, and his famous quote provided this field with its central guiding principle. The pioneering work of Walras (1874) gave a mathematical formulation for this statement, using his notion of market equilibrium, and opened up the possibility of a formal ratification. Mathematical ratification came with the celebrated Arrow-Debreu Theorem (1954), which established existence of equilibrium in a very general model of the economy; however, an efficient mechanism for finding an equilibrium has remained elusive.
The latter question can clearly benefit from the powerful tools of modern complexity theory and algorithms. We will provide an in-depth overview of the fascinating theory that has emerged around this question over the last decade.
A compelling new issue is extending this deep understanding of markets to the digital economy -- because of some fundamental reasons, the methodology outlined above does not carry over to the digital realm. We will outline recent progress on this issue as well.
Vijay Vazirani got his Bachelor's degree in Computer Science from MIT in 1979 and his Ph.D. from the University of California at Berkeley in 1983. His research has spanned a broad range of themes within the design of efficient algorithms - combinatorial optimization, approximation algorithms, randomized algorithms, parallel algorithms, and most recently algorithmic issues in game theory and mathematical economics. He has also worked in complexity theory, cryptography and information theory. In 2001 he published what is widely regarded as the definitive book on Approximation Algorithms. This book has been translated into Japanese, Polish, French and Chinese. In 2007, he co-edited a comprehensive volume on Algorithmic Game Theory. During 2011-12 he is visiting Stanford University and Caltech under a Guggenheim Fellowship.
One of the key challenges in the data center today is the efficient use of data-center resources while providing services with service-level objectives (SLOs). The primary reason for the challenge is the dynamism in workload requirements over time. We propose dynamic instantiation of virtual caching appliances (caches as virtual machines) to handle the dynamism in workloads and thereby support storage SLOs efficiently. We have developed an SLO-based automation framework for cache instantiation called Dynamite, which includes low-overhead techniques to: determine the workloads that would benefit from caching, determine the appropriate cache size for these workloads, instantiate the cache and non-disruptively migrate the application, and finally warm the cache to quickly return to acceptable service levels. We have evaluated the effectiveness of the individual techniques using a variety of I/O traces and find that they are highly accurate despite approximations that significantly reduce monitoring overheads. Finally, the approach works end to end: using the complete pipeline on a case study involving interfering workloads shows that service levels can be met while utilizing resources efficiently.
Bio: Lakshmi N. Bairavasundaram is a member of technical staff in the Advanced Technology Group at NetApp. His research interests include storage systems, file systems, storage and data management, and fault tolerance. Lakshmi currently focuses on storage and data management abstractions and techniques. He joined NetApp after completing his Ph.D. in Computer Sciences (2008) at the University of Wisconsin-Madison under the supervision of Prof. Andrea Arpaci-Dusseau and Prof. Remzi Arpaci-Dusseau.
Abstract: Wiberg matrix factorization breaks a matrix Y into low-rank factors U and V by solving for V in closed form given U, linearizing V(U) about U, and iteratively minimizing ||Y - UV(U)||_2 with respect to U only. This approach factors the matrix while effectively removing V from the minimization. Recently Eriksson and van den Hengel extended this approach to L1, minimizing ||Y - UV(U)||_1. We generalize their approach beyond factorization to minimize an arbitrary function that is nonlinear in each of two sets of variables. We demonstrate the idea with a practical Wiberg algorithm for L1 bundle adjustment, the first algorithm for that problem. We also show that one Wiberg minimization can be nested inside another, effectively removing two of three sets of variables from a minimization. We demonstrate this idea with a nested Wiberg algorithm for L1 projective bundle adjustment, solving for camera matrices, points, and projective depths.
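The variable-elimination idea can be made concrete with a small sketch in the L2 setting: V is solved in closed form for each U, and U is then updated on the reduced objective ||Y - UV(U)||_F^2 alone. For simplicity, this sketch takes plain gradient steps on U rather than the Wiberg/Gauss-Newton linearization of V(U); all names and parameters are illustrative, not the authors'.

```python
import numpy as np

def solve_v(Y, U):
    # Closed-form least-squares solve: V(U) = argmin_V ||Y - U V||_F^2
    return np.linalg.lstsq(U, Y, rcond=None)[0]

def reduced_factorize(Y, rank, iters=300, seed=0):
    """Minimize ||Y - U V(U)||_F^2 over U only, with V eliminated.

    Because V(U) is optimal for each U, the gradient of the reduced
    objective equals the partial gradient with V held fixed, so a
    small step along R @ V.T descends the reduced objective.
    """
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((Y.shape[0], rank))
    errs = []
    for _ in range(iters):
        V = solve_v(Y, U)
        R = Y - U @ V                                 # residual at the eliminated optimum
        errs.append(np.linalg.norm(R))
        step = 0.5 / (1.0 + np.linalg.norm(V) ** 2)   # conservative step size
        U = U + step * (R @ V.T)
    return U, solve_v(Y, U), errs
```

A Wiberg-style method replaces the gradient step with a Gauss-Newton step on the same reduced residual, which is what gives the approach its fast convergence, and the talk's contribution is carrying this elimination over to L1 objectives and to general nonlinear problems.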
Bio: Dennis Strelow is an engineer on Google’s computer vision research team. He is interested in structure-from-motion, vision-aided navigation, sensor fusion, and image classification. His recent work includes Wiberg optimization, image feature hashing for improved image classification, and fast local descriptors for large-scale image and video classification. Previously, he worked at Quantapoint and Honeywell, and received his B.S., M.S., and Ph.D. degrees from the University of Wisconsin, University of Illinois, and Carnegie Mellon, respectively.
This is Micro 2011 practice talk for a paper co-authored with Gabriel H. Loh of AMD.
Die-stacking technology enables multiple layers of DRAM to be integrated with multicore processors. A promising use of stacked DRAM is as a cache, since its capacity is insufficient to be all of main memory (for all but some embedded systems). However, a 1GB DRAM cache with 64-byte blocks requires 96MB of tag storage. Placing these tags on-chip is impractical (larger than on-chip L3s) while putting them in DRAM is slow (two full DRAM accesses for tag and data). Larger blocks and sub-blocking are possible, but less robust due to fragmentation.
This work efficiently enables conventional block sizes for very large die-stacked DRAM caches with two innovations. First, we make hits faster than just storing tags in stacked DRAM by scheduling the tag and data accesses as a compound access so the data access is always a row buffer hit. Second, we make misses faster with a MissMap that eschews stacked-DRAM access on all misses. Like extreme sub-blocking, our implementation of the MissMap stores a vector of block-valid bits for each “page” in the DRAM cache. Unlike conventional sub-blocking, the MissMap (a) points to many more pages than can be stored in the DRAM cache (making the effects of fragmentation rare) and (b) does not point to the “way” that holds a block (but defers to the off-chip tags).
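The MissMap bookkeeping can be pictured with a toy model: one valid bit per 64-byte block of each 4KB "page", with the map able to cover many more pages than the cache holds at once. The following Python sketch is illustrative only -- the constants, names, and dictionary-based storage are stand-ins, not the paper's hardware design.

```python
class MissMap:
    """Toy MissMap: per-page bit vectors recording which 64-byte blocks
    of that page are present in the DRAM cache."""

    BLOCK = 64    # cache block size in bytes
    PAGE = 4096   # page size in bytes -> 64 blocks per page

    def __init__(self):
        self.pages = {}  # page number -> int used as a 64-bit valid vector

    def _split(self, addr):
        page, offset = divmod(addr, self.PAGE)
        return page, offset // self.BLOCK

    def contains(self, addr):
        # Consulted on every L3 miss: if False, the stacked-DRAM
        # tag+data access is skipped entirely (go straight off-chip).
        page, blk = self._split(addr)
        bits = self.pages.get(page)
        return bits is not None and (bits >> blk) & 1 == 1

    def insert(self, addr):
        page, blk = self._split(addr)
        self.pages[page] = self.pages.get(page, 0) | (1 << blk)

    def evict(self, addr):
        page, blk = self._split(addr)
        if page in self.pages:
            self.pages[page] &= ~(1 << blk)
            if self.pages[page] == 0:
                del self.pages[page]  # free the entry when no blocks remain
```

Note that, as the abstract says, the map records only presence: on a hit it does not say which way holds the block, so the in-DRAM tags are still consulted on the (now guaranteed-useful) access.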
For the evaluated large-footprint commercial workloads, the proposed cache organization delivers performance that is within 92.9% of the performance benefit of an ideal 1GB DRAM cache with an impractical 96MB on-chip SRAM tag array.
Biography (since conference talks are usually given by grad students, not senior professors): Hill began graduate studies in 1981 at Berkeley with David Patterson and Alan Smith. Fortunately, he has finished. He worked on the SPUR multiprocessor workstation and his thesis included developing the 3C model of cache behavior (compulsory, capacity, and conflict misses) and identifying when direct-mapped caches can out-perform set-associative ones ("Big and Dumb is Better"). Hill subsequently joined Wisconsin as an assistant professor.
Department of Biostatistics and Medical Informatics
UW-Madison
----------------
Advanced-stage epithelial ovarian cancer is the leading cause of death due to gynecological malignancy and, unlike other cancers, the five-year survival rate has remained relatively constant for the last 30 years. This mortality is attributable to heterogeneous disease, frequent recurrence and acquired chemoresistance, all of which complicate the attempt to improve treatment outcomes and to generate models of the disease.
In the main, I will discuss a number of hypotheses derived from our analysis of the Cancer Genome Atlas' ovarian pilot project, a multi-modal and high-throughput molecular survey. Our focus on patterns of expression in frequently-mutated "core" signaling pathways motivates statistical models that profile individual disease, correlate with response to drug therapies and promise individual prognosis. Two shorter vignettes will discuss our early results from studies of specific pathways/mechanisms and developing statistical models of the clinical disease management process.
We generalize the "marginal revenue" approach to optimal mechanism design to environments with multi-dimensional agent preferences. This approach can be viewed as a reduction from multi- to single-agent mechanism design. For a single agent, marginal revenue describes the change in performance as a function of the ex ante probability that the agent is served. When the distribution of preferences is well behaved, the optimal mechanism for multiple agents is to serve the agents with the highest marginal revenue. This is a generalization of Myerson's (1981) virtual-value-based construction.
When the distribution of preferences is not well behaved, the marginal revenue approach fails, and so does any virtual-value-based construction. In these environments we consider performance as a function of the interim allocation rule. Given a construction of the performance-optimal mechanism for any interim allocation rule for a single agent, the optimal mechanism for multiple agents can be constructed. The construction requires joint feasibility of the interim allocation rule. This joint feasibility is known as Border's (1991) condition. In general, Border's condition has exponentially many constraints; however, we give a polynomial time separation oracle from which the optimal mechanism can be tractably constructed.
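For reference, the virtual value in Myerson's single-dimensional construction mentioned above is the standard expression

```latex
\phi(v) \;=\; v - \frac{1 - F(v)}{f(v)},
```

where $F$ is the distribution of the agent's value and $f$ its density. "Well behaved" (regular) distributions are those for which $\phi$ is nondecreasing, and the optimal auction serves the agents with the highest nonnegative virtual values; the marginal-revenue approach generalizes exactly this quantity.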
This work introduces a new processor architecture, the idempotent processor architecture, that allows speculative execution without the need for hardware checkpoints to recover from mis-speculation. Idempotent processors execute programs as a sequence of compiler-constructed idempotent (re-executable) regions, and the nature of these regions allows state to be reproduced by re-execution, obviating the need for hardware recovery support. The work builds upon the insight that programs naturally decompose into a series of idempotent regions and that these regions can be large.
The talk will cover how the idempotent processor architecture simplifies the design of in-order processors. The idempotent processor architecture eliminates much of the complexity in modern in-order processors by allowing instructions to retire out of order, with support for re-execution when necessary to recover precise state. Our quantitative results show that we obtain a geometric mean performance increase of 4.4% (up to 25% and beyond) while maintaining an overall reduction in power and hardware complexity.
************************************
Gagan Gupta: Dataflow Execution of Sequential Imperative Programs on Multicore Architectures
As multicore processors become the default, researchers are aggressively looking for program execution models that make it easier to use the available resources. Multithreaded programming models that rely on statically-parallel programs have gained prevalence. Most of the existing research is directed at adapting and enhancing such models, alleviating their drawbacks, and simplifying their usage. This paper takes a different approach and proposes a novel execution model to achieve parallel execution of statically-sequential programs. It dynamically parallelizes the execution of suitably-written sequential programs, in a dataflow fashion, on multiple processing cores. Significantly, the execution is race-free and determinate. Thus the model eases program development and yet exploits available parallelism. This presentation describes the implementation of a software runtime library that implements the proposed execution model on existing commercial multicore machines. We present results from experiments running benchmark programs, using both the proposed technique as well as traditional parallel programming, on three different systems. We find that in addition to easing the development of the benchmarks, the approach is resource-efficient and achieves performance similar to the traditional approach, using stock compilers, operating systems and hardware, despite the overheads of an all-software implementation of the model.
Come meet members of the UW Artificial Intelligence community! This is a chance for new and returning students to get a feel for AI research being done here at UW. It is also an opportunity to introduce students to professors and researchers they may want to work with in the future. This week will feature talks from
David Baumler
Genome Center of Wisconsin
UW-Madison
No other family of microorganisms has had a greater impact on human health than the Enterobacteriaceae, and these bacteria have evolved into a wide variety of commensal and human, plant, and avian pathogens. These organisms diverged from a common ancestor ~300-500 million years ago (MYA), and little is known about their ancestral metabolism. Using a paleo systems biology approach, the metabolism of ancient microorganisms has been investigated through construction of metabolic models using either ancient genomic DNA (such as the Yersinia pestis genome recently sequenced from human corpses that were victims of the 2nd pandemic of the black plague, ~1300 A.D.) or a comparison of 72 enterobacterial genomes of modern descendants.
I will present an analysis of the ability of these ancient metabolic models to utilize 300 substrates and how some of these metabolic strategies are used in numerous human niche locations where modern-day enterobacteria cause disease. Overall this work will demonstrate that models of ancient bacteria can be used to accurately predict metabolism and to derive new targets to control human disease.
Come meet members of the UW Artificial Intelligence community! This is a chance for new and returning students to get a feel for AI research being done here at UW. It is also an opportunity to introduce students to professors and researchers they may want to work with in the future. This week will feature talks from
You are cordially invited to attend the class project presentations for CS706: Analysis of Software Artifacts. Come learn about a variety of innovative approaches to helping humans build software that works. Attend as many or as few talks as you wish.
11:00am - 11:18am: Taming File Systems Resource Leak 11:18am - 11:36am: Package Dependency Customization and Visualization 11:36am - 11:54am: How to Diagnose Performance Bugs?
Rajeev Rastogi, VP and Head of Yahoo! Labs Bangalore
Abstract: The web is a vast repository of human knowledge. Extracting structured data from web pages can enable applications like comparison shopping, and lead to improved ranking and rendering of search results. In this talk, I will describe two efforts at Yahoo! Labs to extract records from pages at web scale. The first is a wrapper induction system that handles end-to-end extraction tasks from clustering web pages to learning XPath extraction rules to relearning rules when sites change. The system has been deployed in production within Yahoo! to extract more than 200 million records from ~200 web sites. The second effort exploits content redundancy on the web to automatically extract records without human supervision. Starting with a seed database, we determine values in the pages of each new site that match attribute values in the seed records. We devise a new notion of similarity for matching templatized attribute content, and an apriori style algorithm that exploits templatized page structure to prune spurious attribute matches.
You are cordially invited to attend the class project presentations for CS706: Analysis of Software Artifacts. Come learn about a variety of innovative approaches to helping humans build software that works. Attend as many or as few talks as you wish.
11:00am - 11:18am: Specializing Functions for Sliced Executables 11:18am - 11:36am: Toward Better Tools for Debuggers 11:36am - 11:54am: Fuzz Testing Google NaCl 11:54am - 12:12pm: MaliciOS: An Environment for Advance Identification of Runtime Failures
You are cordially invited to attend the class project presentations for CS706: Analysis of Software Artifacts. Come learn about a variety of innovative approaches to helping humans build software that works. Attend as many or as few talks as you wish.
11:00am - 11:18am: Static Verification of XPCOM Clients using Dehydra 11:18am - 11:36am: Semi-Supervised Statistical Debugging of Non-Fatal Errors 11:36am - 11:54am: Overcoming Concolic Testing Limitations 11:54am - 12:12pm: Accelerating Data Race Detection with Memory Ranges
All members of the department are invited to join a poster session with projects from the HCI class. Cookies and chocolate will be served. Below are abstracts for the posters.
Effects of facial expression in video- and avatar-based computer-mediated communication
The goal of this study is to explore collaborative behavior and rapport in different modes of computer-mediated communication (CMC). We tested whether the positive effects of smiling that are well described in the literature are present in video- and 3D-avatar-based CMC. A between-subjects laboratory experiment compared the effects of smiles shown in prerecorded and rendered video sequences of a human and an avatar, respectively, displayed during the iterated prisoner’s dilemma game. Self-reported and behavioral measures included collaborative behavior of the participant, likability, and rapport. Human game partners were perceived as more likable than avatars. Smiles of the game partner predicted likability and rapport, but not collaboration. Importantly, these effects were significant only for male participants. Other effects of gender were observed: females reported higher likability of both types of game partners and smiled for a longer time, while males were more collaborative with human than with avatar game partners.
Comparing Administration Methods of Psychological Instruments
Tony McDonald & Lauren Meyer
Most psychological instruments were designed and normed for paper-and-pencil administration, but many researchers now use computers instead. Studies indicate there may be differences between these modes. How does the mode of administration affect instruments’ psychometric properties? How do participants perceive the experience? Twenty-seven UW-Madison undergraduate and graduate students completed four instruments (Social Responsiveness Scale, Interpersonal Trust Scale, modified NASA Task Load Index, experience) in a 2 (mode: computer vs. paper) by 2 (location: lab vs. home) between-subjects design. Data from the computer conditions (n = 15) were less reliable than from the paper conditions (n = 12), IPTS Cronbach’s α = 0.39 vs. 0.76. Participants took less time in the computer conditions, p = 0.004, and there was an interaction with location, p = 0.01. Participants in the computer conditions correctly judged the computer administration to be faster, p = 0.04. Participants preferred surveys on paper and in the lab, p < 0.001.
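The Cronbach's α figures above come from the standard item-variance formula. A minimal sketch of the computation (rows are respondents, columns are items; this is the textbook formula, not the authors' analysis code):

```python
import statistics

def cronbach_alpha(scores):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of total scores),
    for a score matrix with one row per respondent and one column per item."""
    k = len(scores[0])  # number of items
    # Variance of each item across respondents, summed over items.
    item_var = sum(statistics.pvariance(col) for col in zip(*scores))
    # Variance of each respondent's total score.
    total_var = statistics.pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var / total_var)
```

Since α is a ratio of variances, using population or sample variance throughout gives the same result; α near 1 indicates high internal consistency among items, which is why 0.39 vs. 0.76 signals a meaningful reliability difference between modes.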
Do Only Humans Cheat? Exploring the Role of Cheating in a Video Game
J.J. De Simone, Tessa Verbruggen, & Li-Hsiang Kuo
In sports and board games, an opponent cheating is typically met with disdain, anger, and disengagement by the other players. However, work has yet to address the role of AI cheating in video games. Participants played either a cheating or non-cheating version of a modified open source tower defense game. Results indicate that when an AI competitor cheats, players perceive the opponent as being more human. Cheating also increases player aggravation, but does not impact presence and enjoyment of the experience. Game designers can integrate subtle levels of cheating into AI opponents without any real negative responses from the players. The data indicate that minor levels of cheating might also increase player engagement with video games.
Effects of robot's feedback under time pressure
Heemoon Chae, Mai Lee Chang, & Nisha Kiran
During human-robot interaction under task time pressure, a robot’s feedback must be carefully designed, since humans are under more stress and have less tolerance for the robot’s mistakes. Thus far, no research has investigated the effects of a robot’s feedback under time pressure. Our study examines the effects of a robot’s feedback on performance, usefulness, and satisfaction under task time pressure. Participants were asked to find words in a word search. The independent variables were feedback (none vs. verbal) and time pressure (none vs. high). Feedback included both compliments and hints. The dependent variables were usefulness, satisfaction, workload, and words per minute. The results show that when the robot gave verbal feedback, participants found it useful and helpful and reported heightened alertness.
The Effects of Text and Robotic Agents on Deception Detection
Wesley Miller & Michael Seaholm
When people attempt to conceal the truth from others, they typically exhibit what are known as deception cues: identifiable indications that the person in question is speaking untruthfully. For this study, we were interested in how well an individual can separate truth from falsehood when a small subset of these deception cues is exhibited not only by human, but also by robotic and text-based agents. Participants interacted with recordings of each agent in a predetermined order, asking them questions from a preset list, and rated each response in terms of perceived truthfulness. Once they had interacted with all three agents, participants were asked to rate the overall trustworthiness of each agent. The results of our study indicate that participants were able to detect deception more reliably from the human agent, and that statements given by the text-based agent were consistently rated more truthful than the statements of the other agents.
An Adaptive Autonomous System for Second Language Acquisition
Young-Bum Kim, Pallavika Ramaswamy, & Soyoun Kim
Second language acquisition poses a unique challenge in that it combines elements of rote memory with skill integration and creative production. To handle these challenges, we harness the power of frequent self-testing by designing a Computer-Aided Language Learning (CALL) system. We explore whether people can learn a second language with our CALL system by evaluating learner responses to a language translation task. Our CALL system can automatically assess the responses and adapt the questions to the level of the learner. We compare the efficiency of our adaptive CALL system with a one-size-fits-all approach and examine how it affects the learning curves of people with different knowledge bases. To this end, we show how we can (1) estimate the semantic and syntactic difficulty of a sentence, (2) autonomously grade the response, and (3) select problems according to the assessed level of the learner. In an uncontrolled online experiment, we asked participants to translate English sentences into Korean. A third of the participants were given randomly chosen questions and assessed manually; another third were given randomly chosen questions but assessed with our autonomous assessment system; and a final third were given questions chosen adaptively by our system. With this information, we evaluate how the adaptive autonomous assessment framework affected the participants' task performance.
The CS736 (Advanced Operating Systems) class will present posters of their final projects. Come hear about this great collection of projects:
Improved APIs and Programming Models for Deterministic Multithreading, Aarti Basant, James Paton, Seth Pollen Abstract: Dthreads is a deterministic multithreading library which conforms to the pthreads API and programming model. However, the pthreads API was originally designed to handle nondeterministic threading, and using it to control a deterministic library masks the actual semantics and introduces needless complexity. We propose a new, simplified API for deterministic multithreading, consisting of just two calls which lock and unlock a globally shared mutex. We show that this new API, when combined with the semantics of Dthreads, is sufficiently powerful for general use and can be used as the basis for more complex and powerful synchronization operations, such as condition variables and barriers.
We then evaluate the new API by incorporating its use into the PARSEC benchmark suite. We present benchmark data to support the conclusion that our API changes do not negatively impact performance. In addition, we present a case study of a PARSEC program that is broken by linking to the Dthreads library. We show that a Dthreads-aware port of the program can fix the problem, albeit at a cost to performance. We conclude by outlining avenues for future work.
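To illustrate why two calls suffice, here is a sketch of a barrier built from nothing but lock/unlock of one global mutex; it is written in Python with `threading.Lock` standing in for Dthreads' global mutex, and the `det_lock`/`det_unlock`/`det_barrier` names are our assumptions, not the proposed API itself:

```python
# Sketch: a barrier layered on a two-call API whose only primitives are
# lock/unlock of a single globally shared mutex (names are assumptions).
import threading

_global_mutex = threading.Lock()

def det_lock():
    _global_mutex.acquire()

def det_unlock():
    _global_mutex.release()

_arrived = 0
_generation = 0

def det_barrier(nthreads):
    """Spin barrier: count arrivals under the lock, then poll the generation
    number until the last arriving thread flips it."""
    global _arrived, _generation
    det_lock()
    gen = _generation
    _arrived += 1
    if _arrived == nthreads:
        _arrived = 0
        _generation += 1      # releases every waiter
        det_unlock()
        return
    det_unlock()
    while True:               # spin until the generation advances
        det_lock()
        done = (_generation != gen)
        det_unlock()
        if done:
            return
```

Under Dthreads' deterministic semantics the spinning would be resolved by the deterministic scheduler rather than by wasted cycles; the point is only that richer synchronization operations are expressible in the two calls.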
Race-Guided Random Testing for Multithreaded Applications, Dongdong Deng, Xiaoyang Gao
Abstract: Exposing concurrency bugs in multithreaded applications is difficult because of the large interleaving space. Moreover, buggy interleavings may occur with extremely low probability, making repeated execution of an application an inefficient way to trigger concurrency bugs. With our race detector, we can focus on the instruction pairs directly involved in data races, reducing the interleaving space significantly. To expose concurrency bugs, we perturb the execution of an application by inserting delays of random length at randomly selected instructions. We analyzed the causes of concurrency bugs, as well as the buggy interleavings, in four multithreaded applications, and our experimental results show that we can discover data races and expose concurrency bugs effectively in these applications.
The Pedigree of Linux, Zev Weiss, Matthew Wolfe
Abstract: We have performed a source-level study of the variations between different versions of the Linux kernel as packaged with six modern Linux distributions. Our goal was to provide a summary of the primary differences that have emerged between operating systems all nominally referred to as "Linux".
We performed a detailed examination of the source code modifications made in each of the respective kernels, six in total. We report on our findings by presenting a broad sampling of new features added to the Linux kernel by each distribution, which are often quite recognizably relevant to that particular distribution's intended use. Ultimately we find that there are significant differences in the feature sets offered by the various versions of the kernel, but do not see any evidence of divergence to an extent that would break backward compatibility or cause compatibility problems for portable programs (so long as they were not written specifically to use a particular new feature offered on a certain distribution).
Tree-Based Density Clustering using Graphics Processors, Evan Samanas, Benjamin Welton
Abstract: Tree-Based Overlay Networks (TBONs) have proven to be a powerful and scalable communication model for building tools for massively parallel and distributed systems, but the model is less proven for building applications. As a case study, we present software that uses the TBON model to efficiently execute a popular clustering algorithm, usually run on a single node, while utilizing Graphics Processing Units (GPUs) in a TBON. Our non-GPU implementation provides up to a 23x speedup at 32 nodes over the algorithm running on a single node, while a non-TBON-based solution is able to achieve a 20x speedup before declining. A 7x-23x speedup is also seen when using a single GPU to run the same clustering algorithm. We then give an example use of clustering at large scale by attempting to track the movement and outbreak of influenza in the United States.
Implementing a Data Race Detector with the Dyninst API, Xiaozhu Meng, Luke Pillar, Nairan Zhang
Abstract: Data races are difficult to debug manually, so a tool that can detect data races automatically is quite useful when writing multi-threaded programs. In this paper, we describe the implementation of a data race detector using the Dyninst API, an efficient binary-instrumentation tool. Our detector implements a modification of the vector clock algorithm, based on the “happened-before” partial-ordering relationship. We implement this algorithm for the pthread library, instrumenting the pthread synchronization primitives. In our tests, the detector successfully reports data races. We conduct a comprehensive study of the detector's performance and scalability and give an analysis of its performance bottlenecks. We believe that our detector has the potential to be used on real programs after further optimization.
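The happened-before idea behind vector clock race detection can be sketched as follows; the class and method names are our own illustration, not the detector's actual code, and real detectors also distinguish reads from writes:

```python
# Minimal sketch of happened-before race detection with vector clocks
# (an illustration, not the poster's implementation).

def vc_leq(a, b):
    """True if vector clock a happened-before (or equals) b."""
    return all(a.get(t, 0) <= b.get(t, 0) for t in a)

class RaceDetector:
    def __init__(self):
        self.clock = {}    # thread -> its current vector clock
        self.lock_vc = {}  # lock   -> clock captured at its last release
        self.last = {}     # var    -> (thread, clock at last access)
        self.races = []

    def _vc(self, t):
        return self.clock.setdefault(t, {t: 0})

    def _tick(self, t):
        vc = self._vc(t)
        vc[t] = vc.get(t, 0) + 1

    def acquire(self, t, lock):
        vc = self._vc(t)   # join the releasing thread's clock into ours
        for k, v in self.lock_vc.get(lock, {}).items():
            vc[k] = max(vc.get(k, 0), v)
        self._tick(t)

    def release(self, t, lock):
        self._tick(t)
        self.lock_vc[lock] = dict(self._vc(t))

    def access(self, t, var):
        self._tick(t)
        vc = self._vc(t)
        prev = self.last.get(var)
        if prev is not None:
            pt, pvc = prev
            # race: the previous access is unordered with respect to this one
            if pt != t and not vc_leq(pvc, vc):
                self.races.append((var, pt, t))
        self.last[var] = (t, dict(vc))
```

Two accesses guarded by the same lock are ordered, because the acquiring thread joins the releasing thread's clock, so no race is reported; unguarded accesses from different threads stay unordered and are flagged.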
A procfs Interface for Debugging in Linux, Srinivas Govindan, Sathya Gunasekar
Abstract: Many Unix-based operating systems provide support for debugging through their procfs interface. Linux, however, provides debugging support through the ptrace system call, while exposing a process through its procfs interface only to a limited extent. We extend the current procfs interface in Linux to support all the debugging features provided by ptrace, and we add features that enable a debugger to perform better. Debuggers can read from and write to specific files in /proc to perform debugging tasks. Our debugging interface can read and write the process address space and registers, and it supports tracing specific system calls that are explicitly of interest to the debugger. We also enable the debugger to choose not to be notified about specific signals delivered to a traced process. Our simple debugging interface performs as well as ptrace for normal debugging and can perform better when tracing system calls. It also provides much faster reads and writes to the address space of a process compared to ptrace, which only allows access one word at a time.
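For flavor, here is how the file-based style of process inspection already looks on a stock Linux kernel, using the existing read-only /proc files; the poster's extension adds writable debugging files (registers, syscall tracing) in the same spirit. This sketch assumes a Linux system and degrades gracefully elsewhere:

```python
# Sketch of procfs-style process inspection using the stock read-only
# /proc/<pid>/status file (assumes Linux; returns None on other systems).
import os

def read_proc_status(pid):
    """Return the /proc/<pid>/status fields as a dict, or None if unavailable."""
    path = "/proc/%d/status" % pid
    if not os.path.exists(path):
        return None
    fields = {}
    with open(path) as f:
        for line in f:
            key, _, value = line.partition(":")
            fields[key] = value.strip()
    return fields

info = read_proc_status(os.getpid())   # e.g. info["Name"], info["State"]
```

A debugging interface in this style replaces ptrace's one-word-at-a-time calls with ordinary bulk reads and writes on files, which is where the reported speedup on address-space access comes from.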
Debugging support through /proc file system in Linux, Victor Bittorf, Igor Canadi
Abstract: Using the virtual file system in Linux, we implemented debugger primitives through the /proc file system. Our new interface is intended as a replacement for the existing ptrace system call; we discuss the differences and advantages of our model. We created a proctrace library to semi-transparently emulate ptrace using our interface. While we cannot emulate ptrace perfectly, we had some success in porting strace. We also created our own tracer, itrace, which can be run simultaneously with gdb.
Reverse Engineering of Data Structures using Dyninst, Eric Lederer, Srividya Doss
Abstract: Knowledge of a program’s data structures, along with their syntactic and semantic definitions, can be highly valuable in a variety of forensic and security applications. In this paper, we attempt to infer rich information regarding the types of these data structures from only the binary executable of a program. We first instrument the binary with the Dyninst API so that we can monitor data flow across memory locations and registers when the instrumented binary is executed. At runtime, each memory access is tracked by a shadow array, and we construct a graph of dependencies between tracked memory locations. We use the REWARDS [1] dynamic analysis technique to automatically reverse engineer data structures from this information by propagating type information across these dependencies between memory locations. The key to type resolution is the execution of a type-revealing instruction called a “type sink,” typically a system call or standard library function with well-defined parameter and return type information. We have developed a prototype tool, Pumpkin, which outputs a visualization of the data structures present in the binary’s memory address space and the layout of data structures within the binary’s custom function stack frames.
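The sink-driven propagation step can be sketched as a flood fill over the dependency graph; this is a deliberate simplification (REWARDS also propagates through instruction semantics and resolves conflicting types), and the location names below are made up for illustration:

```python
# Simplified sketch of REWARDS-style type propagation: locations connected
# by observed data flow inherit the type revealed at a type sink.
from collections import deque

def propagate_types(deps, sinks):
    """deps: (a, b) pairs meaning a value flowed between locations a and b.
    sinks: {location: type} revealed by type sinks such as syscall arguments.
    Returns a {location: type} map after flooding types across dependencies."""
    adj = {}
    for a, b in deps:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    types = dict(sinks)
    work = deque(sinks)
    while work:
        loc = work.popleft()
        for nb in adj.get(loc, ()):
            if nb not in types:          # first type to reach a location wins
                types[nb] = types[loc]
                work.append(nb)
    return types

deps = [("eax", "mem_0x100"), ("mem_0x100", "mem_0x104")]
types = propagate_types(deps, {"eax": "int"})
```

Locations with no data-flow path to any sink remain untyped, which matches the intuition that a type sink is the ultimate source of all resolved type information.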