The Lockset algorithm in Locksmith
Locksmith combines several static analyses into several phases. The main phases are:
- Typing: Initially, we use
CIL to parse
the input C program and generate the AST. Then we apply the
Locksmith type system to generate data flow and control flow
constraints. We call these the label flow and
control flow graphs. The label flow graph includes the
constraints for computing the possibly shared locations in the
program.
Our publications
present a type system similar in principle to the implementation,
presented for an example language, a lot simpler than C.
We use
Banshee, a
constraint solving engine, to represent and solve the constraints
forming the label flow graph.
- Shared analysis: The next step is to compute all shared
locations. This is done by solving and then intersecting two
effect sets at each fork point.
Specifically, whenever a new thread is created, we compute all
the locations accessed (pointers dereferenced) by the new thread,
and similarly all the locations accessed by the parent thread.
Intersecting these two gives the set of locations that could be
shared between the two threads.
- Lock state: The third phase in the Locksmith implementation
computes the state of each lock at every program point. In order
to do this, we apply a data flow algorithm using the control flow
graph generated at the typing phase. This computes a set of
acquired locks for every program point. The computation is
conservative, because it might report that a lock is released at
a given program point, even if it could be acquired.
- Correlation: Next, we scan every dereference of every
pointer that points to a shared location, and correlate
the location with the acquired locks at the dereference's program
point. Using these correlations we compute the locks that
consistently protect each shared location throughout all
its dereferences, or the protection set for that location.
If that set is empty, then this is reported as a possible data race.
|
|