Locksmith - Algorithm Description

The Lockset algorithm in Locksmith

Locksmith combines several static analyses into several phases. The main phases are:

Typing: Initially, we use CIL to parse the input C program and generate the AST. Then we apply the Locksmith type system to generate data flow and control flow constraints. We call these the label flow and control flow graphs. The label flow graph includes the constraints for computing the possibly shared locations in the program. Our publications present a type system similar in principle to the implementation, presented for an example language, a lot simpler than C. We use Banshee, a constraint solving engine, to represent and solve the constraints forming the label flow graph.
Shared analysis: The next step is to compute all shared locations. This is done by solving and then intersecting two effect sets at each fork point. Specifically, whenever a new thread is created, we compute all the locations accessed (pointers dereferenced) by the new thread, and similarly all the locations accessed by the parent thread. Intersecting these two gives the set of locations that could be shared between the two threads.
Lock state: The third phase in the Locksmith implementation computes the state of each lock at every program point. In order to do this, we apply a data flow algorithm using the control flow graph generated at the typing phase. This computes a set of acquired locks for every program point. The computation is conservative, because it might report that a lock is released at a given program point, even if it could be acquired.
Correlation: Next, we scan every dereference of every pointer that points to a shared location, and correlate the location with the acquired locks at the dereference's program point. Using these correlations we compute the locks that consistently protect each shared location throughout all its dereferences, or the protection set for that location. If that set is empty, then this is reported as a possible data race.