Documentation

UW Connect

PASTE '11

The PASTE workshop was collocated with FSE/ESEC this year and located in Szeged, Hungary. As a result (of either) both submissions and attendance were low. Two of the six papers were from the Paradyn group, leaving two long (8 page) and two short (4 page) papers accepted.

The workshop opened with a keynote by Sumit Gulwani of Microsoft Research. He is interested in making "coding" easier for the vast number of computer users that are not computer programmers (such as my wife). He noted that most of these people had "simple" repetitive tasks which are easily performed by scripts (such as VBscript) but that those scripts are hard to write and thus people do it by hand. He proposed a solution based on program synthesis, which automatically creates programs from a description of their input and desired output. The majority of the keynote focused on demos from an Excel extension that allowed users to provide some input (e.g., a list of dates in various formats) and a single output example (e.g., the corresponding day of the month) and would automatically create a program that would convert the former to the latter. If the computer generated an incorrect mapping a user could correct particular output examples to further refine the synthesized program.

Unfortunately, I'm putting this down as "too good to be true". While his examples were compelling, I'm unsure how general his tool is; from offhand comments it sounded like it was tuned very precisely for the particular examples and would fail if anything else was tried. That said, if it should show up in the next version of Office I will eat my words and buy Jenn a copy. Taking a step backwards, the concept of making programming easier for day-to-day users is an excellent area to investigate, and the idea of "writing a program" by providing an example mapping and then further tuning is a good one.

Update: This functionality is being added to Excel 2013. This is an amazing piece of work and will save a lot of people a lot of time. It's good to see an increase in computer usability for the general public! For more info, look here.

After the keynote we presented our work. Emily talked about unstrip, work she had done with Nate in recovering wrapper function names from their system call behavior. It was generally well received, and there was some interest expressed in Windows support. I discussed the Dyninst relocation engine (instrumentation with CFG transformations) and patching techniques. I received a general question about ARM support, as well as one looking at instrumenting embedded applications where memory usage was highly constrained. While this is an area that I am not particularly interested in, I think it is basically unexplored by modern instrumentation toolkits. I know that PIN et al will at a minimum double the size of the program, and Dyninst can approach that with pervasive instrumentation because it focuses on efficiency over memory usage.

After our presentations the other papers were presented.

The first of these, "Towards Systematic, Comprehensive Trace Generation for Behavioral Pattern Detection through Symbolic Execution", investigated the recovery of design patterns from source code automatically. The motivation is software maintenance; assuming that developers leave and documentation is poor, then simply understanding a body of code is a hard task. The author Markus von Detten noted that many programmers use design patterns (in the OO usage) either explicitly or implicitly, and if we could automatically recover the patterns behind a piece of software it would be much easier to understand.

I found this paper fascinating, both because it is an immediately useful result and because it was a good use of a favorite technique of mine, symbolic execution. The initial approach to deriving the patterns was trace based; take a trace and compare its method invocations to the expected calls for particular patterns. However, this approach is limited since a trace is a single example of what the program could do. He then used SE to broaden a single trace out to multiple possibilities. Normally SE has a significant problem with loops, as it may be impossible to determine how many times a loop executes. Markus addressed this problem by ignoring it and instead encoding "a loop occurred" in the behavior recognition. Overall, an interesting paper and result.

The second paper, "An Evaluation of Change-Based Coverage Criteria", focused on software testing, specifically improving code coverage-based tests. Code coverage is a standard testing metric; the idea in a nutshell is that if your tests exercise every line of code in a project, then clearly they entirely test the project. However, ensuring complete code coverage is a really difficult task. The authors instead propose change-based coverage in which the tests only attempt to cover lines of code that changed recently; the motivating concept is that changes are likely to introduce bugs and so focus on the changes. They find that change-based coverage is a useful heuristic (as clearly it's not complete) and that it is much easier to write localized change-based tests. In summary: developers, write tests for your changed code :)

I found this paper interesting in that they are trying to make code coverage more attainable, but less compelling than the first paper. Part of this was due to the presentation (right after lunch) and part was due to its lower immediate applicability; I am unsure how this would assist me in actually writing tests (other than the lower volume I am expected to write).

The third paper, "Locating Failure-Inducing Environment Changes", focused on determining whether the program environment has an effect on program failure. They note that the environment forms a silent partner with a program and its input when determining whether a program runs or fails, and propose a replay-based technique that determines which system call (as the environment interface) failed.

I found this paper less interesting due to its assumptions (statically linked, single threaded programs) and it's evaluation - the test cases were somewhat simplistic. Also, its major contribution seemed to be the record and replay technique, which I am certain has been treated in greater complexity in other areas (e.g., checkpoint/restore code).

The final paper, "Assessing Modularity via Usage Changes", is another paper based on assisting developers. Their idea is that code should be modular, which they define as allowing internal changes without requiring external users to change their code to suit. They propose this metric and describe a technique for automatically mining modularity from an existing codebase by looking at a revision history.

This was a short paper and thus light on detail, but I found the automatic nature of their technique interesting. It is often easy to feel that your code is modular without any proof, and as someone who has made his share of interface-breaking code changes (e.g., the recent relocation system and Kevin's work) a measurement technique like this would be helpful.