Research Summary
My research focuses on binary analysis, instrumentation, and modification. A binary is a program form that consists only of the code and data necessary for execution, and may not contain any other semantic information (e.g., symbols or debugging information). Binaries are particularly challenging to modify because they lack this information; any modification may easily cause undesirable behavior such as incorrect output or crashes. However, binary instrumentation is also foundational to many other areas of research, such as performance analysis (via tracing), code steering, cyberforensics, program auditing, and attack detection. As a result, this is a challenging yet rewarding area of research.
My latest published work uses transformations of the control flow graph (CFG) to change the execution of a program. In previous work, we assumed that the modifications we make to the binary (to insert new code or instrumentation) had no effect on the original CFG and execution of the program; we seek to remove this restriction. Modifying program execution is useful in a wide variety of areas: for example, adding a new execution path could be used to insert error handling code into an existing program; modifying existing paths could be used to inject new instrumentation; and removing paths could be used to circumvent unwanted security checks in malware. This work addresses three questions: first, what characteristics must the CFG maintain at all times; second, what aspects of control flow can be changed via the CFG; and third, what constraints do we wish to impose on any inserted code.
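As a rough illustration of the three kinds of CFG change described above (adding, modifying, and removing paths), the following is a minimal sketch over a toy graph. It is not the representation or algorithm used in the published work: the `CFG` class, the block names, and the particular well-formedness invariant (every block reachable from the entry, every non-exit block has a successor) are all assumptions chosen for illustration.

```python
from collections import deque


class CFG:
    """Toy control flow graph: basic-block names mapped to successor sets.

    Hypothetical structure for illustration only.
    """

    def __init__(self, entry, exits):
        self.entry = entry
        self.exits = set(exits)
        self.succ = {entry: set()}
        for block in self.exits:
            self.succ.setdefault(block, set())

    def add_edge(self, src, dst):
        """Add a control flow edge, creating either block if needed."""
        self.succ.setdefault(src, set()).add(dst)
        self.succ.setdefault(dst, set())

    def remove_edge(self, src, dst):
        """Remove a control flow edge; the blocks themselves remain."""
        self.succ[src].discard(dst)

    def reachable(self):
        """Return the set of blocks reachable from the entry (breadth-first)."""
        seen = {self.entry}
        queue = deque([self.entry])
        while queue:
            block = queue.popleft()
            for nxt in self.succ[block]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return seen

    def is_well_formed(self):
        """Assumed invariant: no unreachable blocks, no dead-end non-exit blocks."""
        if self.reachable() != set(self.succ):
            return False
        return all(self.succ[b] or b in self.exits for b in self.succ)


def build_example():
    """Build a small CFG with one branch: check -> work | abort."""
    cfg = CFG(entry="entry", exits=["exit"])
    cfg.add_edge("entry", "check")
    cfg.add_edge("check", "work")
    cfg.add_edge("check", "abort")
    cfg.add_edge("work", "exit")
    cfg.add_edge("abort", "exit")
    return cfg


if __name__ == "__main__":
    cfg = build_example()

    # Add a path: a new error-handling block reachable from "work".
    cfg.add_edge("work", "handler")
    cfg.add_edge("handler", "exit")
    print("after adding a path:", cfg.is_well_formed())

    # Modify a path: splice an instrumentation block onto check -> work.
    cfg.remove_edge("check", "work")
    cfg.add_edge("check", "instr")
    cfg.add_edge("instr", "work")
    print("after splicing instrumentation:", cfg.is_well_formed())

    # Remove a path: dropping check -> abort strands the "abort" block.
    cfg.remove_edge("check", "abort")
    print("after removing a path:", cfg.is_well_formed())
```

In this sketch, removing the edge alone leaves `abort` unreachable, so the assumed invariant no longer holds until the stranded block is also dealt with. That is one small example of why the first question, which characteristics the CFG must maintain at all times, has to be settled before the transformations themselves.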
Previously, I investigated using formal analysis to guarantee that instrumentation does not affect the original visible behavior of the binary. We define visible behavior in terms of denotational semantics and provide a stricter approximation that is decidable with static analysis. This approach allows us to insert code into highly defensive binaries, such as self-checksumming programs or malicious programs, without triggering their tamper-resistance techniques.
