Mid-Atlantic Programming Language Seminar in Conjunction with NJPLS

Friday, November 18, 2005


Directions

Here are directions to the CS department at UMCP. We will have most of the meeting in the Computer Science Instructional Center (CSI), and lunch in the A.V. Williams (AVW) Building, where the department is located. Here is location information on CSI and AVW. Click the "Location" link to see a campus map for each building. Here are floorplans of CSI.


Agenda

Arrival and Coffee: 9:30 - 10:00

Session 1: 10:00 - 11:30 CSIC 3120
Trevor Jim (AT & T Labs Research)
Yakker: A Parser Generator for Network Protocol Messages
Polyvios Pratikakis (University of Maryland, College Park)
Context-sensitive Correlation Analysis for Detecting Races

Lunch: 11:30 - 12:50 AVW 1152

Business meeting: 12:50 - 1:00 CSIC 2120

Session 2: 1:00 - 2:30 CSIC 2120
Zhaozhong Ni (Yale University)
Certified Assembly Programming with Embedded Code Pointers
Jinlin Yang (University of Virginia)
Terracotta: Mining Temporal API Rules from Imperfect Traces

Break: 2:30 - 2:45

Session 3: 2:45 - 3:30 CSIC 2120
Paritosh Shroff (Johns Hopkins University)
Program Analysis with Flow-Effect Types


Abstracts

Trevor Jim (AT&T Labs Research)
Yakker: A Parser Generator for Network Protocol Messages

Many network servers use handwritten decoding routines to parse protocol messages. Bugs in these routines are common, and they are the favorite target of malware authors, who use them to propagate worms, install zombies, and crash servers.

To address this problem I am developing Yakker, a highly-automated parser generator for network protocol messages. Yakker takes an IETF Request for Comments (RFC) as input, and produces a parser as output. Yakker's parsers are implemented in a safe language, to prevent buffer overflows, and they have a simple construction, so they can be safely adapted to languages such as C. Using Yakker, I will demonstrate how to build a validating IMAP proxy from RFC 3501 and less than a dozen IMAP-specific lines of code.


Polyvios Pratikakis (University of Maryland, College Park)
Context-sensitive Correlation Analysis for Detecting Races.

Multi-threaded programming is likely to become more widespread as parallel computing resources become more common. To avoid bugs and keep programs simple, programmers typically wish to avoid data races, in which two threads could access a shared location simultaneously. A common idiom for preventing races is to ensure that a consistent mutual exclusion lock "l" is held every time a shared location "r" is accessed. In this case, we say that "r is guarded by l". We present a type-based inference algorithm to infer the "guarded-by" relation in multi-threaded programs, to detect potential data races. The algorithm aims to be sound, and works by building a constraint graph for the program, and then solving the constraints to infer acquired locks, shared variables and guarded-by relations for every point in the program, which we use to check for races.

The novelty of our approach is its use of correlation and existential quantification in these constraint graphs. The former is needed to model "guarded-by" while the latter adds precision. The analysis is context-sensitive, implemented using context free language reachability. We have implemented the system for C. The tool performs reasonably well, checking program up to 80KLOC in under 40 minutes. Using the tool we have found many races in various applications, with a false positive ratio between 0 and 75%.

Slides from the talk. Joint work with Jeff Foster and Michael Hicks.


Zhaozhong Ni (Yale University)
Certified Assembly Programming with Embedded Code Pointers

Embedded code pointers (ECPs) are stored handles of functions and continuations commonly seen in low-level binaries as well as functional or higher-order programs. ECPs are known to be very hard to support well in Hoare-logic style verification systems. As a result, existing proof-carrying code (PCC) systems have to either sacrifice the expressiveness or the modularity of program specifications, or resort to construction of complex semantic models. In Reynolds's LICS'02 paper, supporting ECPs is listed as one of the main open problems for separation logic. In this paper we present a simple and general technique for solving the ECP problem for Hoare-logic-based PCC systems. By adding a small amount of syntax to the assertion language, we show how to combine semantic consequence relation with syntactic proof techniques. The result is a new powerful framework that can perform modular reasoning on ECPs while still retaining the expressiveness of Hoare logic. We show how to use our techniques to support polymorphism, closures, and many other language extensions and how to solve the ECP problem for separation logic. Our system is fully mechanized. We give its complete soundness proof and a full verification of Reynolds's CPS-style ``list-append'' example in the Coq proof assistant.

Slides from the talk. This is joint work with Zhong Shao (Yale University). Preprint of related paper (POPL'06) can be found at http://flint.cs.yale.edu/flint/publications/xcap.html.


Jinlin Yang (University of Virginia)
Terracotta: Mining Temporal API Rules from Imperfect Traces

A major hurdle to widespread adoption of program verification is the lack of program specifications. Temporal specifications constrain the order of occurrence of events. Satisfying them is essential for a program's correctness. We developed a dynamic analysis technique to automatically infer temporal specifications of a program. We generate execution traces of the program by running a set of test cases and then analyze the properties satisfied by traces using a set of predefined patterns. In this talk, I will first explain reasons why scaling dynamic inference techniques has proven difficult. Then I will introduce solutions that enable a dynamic inference technique to scale to large programs and work effectively with the imperfect traces typically available in industrial scenarios. I will present our approximate inference algorithm, heuristics for winnowing the large number of properties inferred to a manageable set of interesting properties, and report on experiments using inferred properties. We have implemented our dynamic analysis technique in a prototype tool called Terracotta. We evaluated it on JBoss and the Windows kernel. For JBoss, our tool showed the conformance of JBoss to the J2EE specification. For Windows kernel, we inferred many of the properties checked by the Static Driver Verifier and discovered a significant bug in Windows by checking the inferred properties with the ESP verifier.

Slides from the talk.


Paritosh Shroff (Johns Hopkins University)
Program Analysis with Flow-Effect Types

In this talk I will present some insights of our new flow sensitive type system for higher order programming languages. Flow-effect types are a novel form of type that combine the notion of temporal ordering inherent in type effect systems, with subtype constraint systems which focus on unordered data-flow. The resulting system achieves a high level of precision by cutting very close to the operational behavior of programs. The naive type closure algorithm completely simulates expression computation, and so is undecidable. The main technical result is development of a sound and decidable closure algorithm for flow-effect types based on a novel prune-rerun technique, to achieve bounded recursion.

Slides from the talk. Joint work with Christian Skalka and Scott Smith.