CMSC 838Z: Tools and Techniques for Software Dependability

Instructor Michael Hicks
CSI 2120 TuTh 3:30-4:45pm
Office Hours TuTh 10:00-11:00am AVW 4131

Syllabus / Schedule / Resources / Wiki


Project Ideas

  1. Dynamic Property Checking.  A number of recent research projects, including Vault, ESP, and CQual, support property checking on tracked objects based on finite state machines.  For example, the FSM for a lock has two states, locked and unlocked, such that a lock in the locked state transitions to the unlocked state when the lock is acquired, and then transitions back when the lock is released.  Similar FSMs can be defined for the states of a socket file descriptor, a filehandle, or other APIs.  All of these systems are based on purely static analysis, and rely on alias analysis to safely perform strong updates on the states of tracked objects.  In particular, an object can only switch states if it is linear, in the sense the system must know exactly which pointers are aliases to the object (most often, this means just a single pointer). However, programming in a linear style can be quite awkward.

    One way to overcome the conservatism of static analysis is to permit dynamic checking.  This is a common approach for null pointers and array indexes: the static checking system is too weak to prove all reasonable programs are safe, so it sometimes verifies an expected condition dynamically (i.e., the pointer is not null, or the index is within bounds).  We can generalize this idea to property checking systems.  In particular, we can associate with a given tracked object a dynamic state based on the user-provided FSM, and usages and transitions of the tracked object will necessitate a dynamic check.  Better, we can perform these checks only when not able to prove the necessary conditions statically.  That is, if objects are linear, then no checks are needed, but once an object becomes non-linear, we must perform dynamic checks.

    Your task: design a dynamic property-checking system, and implement it in the language of your choice (Cyclone, Vault, OCaml, Java would make good choices, since they are safe).

  2. Dynamic Memory Management.  Garbage collectors free objects only when they can never again be used by the program, based on the notion of reachability.  Garbage collectors add overhead to the program whenever they run, since they trace the program pointer graph to determine what is reachable. It is possible that garbage collection performance could be improved by allowing users to free objects immediately, without waiting for the garbage collector to collect them.  One way that programmers do this in safe languages like Java is to write an allocator in the safe language that always hands out objects of the same type T.  The problem with this approach is that, while type safe, it can lead to dangling pointers, since one part of the program can free an object while it is still in use.  One way to deal with this problem would be to use dynamic checks.  In particular, one could associate an "age" both with a pointer and the object pointed to.  Whenever an object is freed, the object's age is increased by one.  Whenever a pointer is dereferenced, its age is compared against the age of the object.  If the ages differ, then we have a dangling pointer.

    Your task would be to implement this basic idea as an automatic analysis, and to optimize it.  In particular, the basic approach will be pretty slow because each dereference will necessitate a dynamic check.  Some possible optimizations would be to (1) avoid checks altogether when you can prove an object is linear, (2) have a "lock" on the object that permits using it check-free until it is unlocked, preventing it from being freed while it is locked.

  3. Safe Device Drivers. The NOOKS project at the University of Washington aims to make device drivers safer by using a form of software fault isolation.  This allows the OS to be protected from a faulty driver, but can inflict a significant performance hit.  Safe language technology is an alternative.  Try using a safe language we have explored in class, like Cyclone or Vault, to implement a device driver, and do a performance study, comparing it to NOOKS.

  4. Whole program inference.  Improve/extend Cyclone's porting tool from C programs to Cyclone. While Cyclone is quite similar to C, the fact that it reveals underlying implementation details (like the layout of memory, stack allocation, whether a pointer requires bounds checks, etc.) makes porting C code to Cyclone somewhat tedious. Cyclone has a simple porting tool, which uses constraint-based program analysis to aid in this process, but it is primtive and brittle. The goal of this project would be to generally improve the tool.   One particularly interesting idea would be to perform region inference to make good use of Cyclone's region-based memory management.  In general, you could use the tool as is on a variety of programs, see where it falls short, and think about how to improve it.

  5. Combining regions with explicit object deallocation.  Cyclone's memory management uses regions, which are memory areas whose objects are deallocated all at once. Recent work has shown that regions can be made more effective by also allowing individual object deallocation as well.  The basic idea would be to extend Cyclone's support for unique pointers to apply to individual regions, rather than a single global region.  The same thing would work for reference-counted regions.  This approach would require some changes to the Cyclone type system, but not many, and additions to the runtime system.