Syllabus

CMSC 838Z: Tools and Techniques for Software Dependability


Instructor Michael Hicks
CSI 2120 TuTh 3:30-4:45pm
Office Hours TuTh 10:00-11:00am AVW 4131

Syllabus / Schedule / Resources / Wiki

Most researchers and practitioners have come to the realization that as software systems pervade nearly all aspects of society, software's dependability has become a paramount concern.  As a result, we are seeing a proliferation of tools designed to improve the quality of software.  These tools employ a wide variety of techniques, including type-checking, theorem proving, constraint-based analysis, model checking, dataflow analysis, and dynamic monitoring, to name a few.  In this course, we will seek to understand the technical aspects and mathematical theory behind these analysis techniques, and examine their utility in particular tools.  Students will have an opportunity to use and critique a variety of open-source tools, and do a non-trivial project towards the goal of improving software quality.

Topics

We will consider a variety of program validation techniques.

  • Type checking.  Traditional views of types are as descriptors of data format, e.g. as integers, strings, and pointers.  While types in this view have an unquestioned benefit on software dependability, types can be used to also express higher level program invariants.  For example, linear type systems can ensure that a program always deallocates resources it allocates, thus preventing leaks.  Types can be used to prevent race conditions and deadlock.  We will examine a tools and approaches that aim to use types to improve software dependability, including Cyclone, CQual, race-free Java, and others.  We will assume a familiarity with basic type systems, as taught in CMSC 631.
  • Model Checking.  Model checking was first popularized as a means to hardware verification, but recent advances have illustrated that it can be effectively applied to software.  We will explore the approaches behind model checking, and experiment with freely available tools, like Blast, SPIN, and SMV.  We will also consider the relationship between model checking, testing, and type systems.
  • Theorem Proving.  A wide variety of tools use automated theorem proving to establish the truth of certain properties in programs.  We will examine both the tools that employ theorem provers, like ESC/Java and Blast, as well as theorem provers themselves, including Twelf and Coq.  A familiarity with first-order logic is assumed (though there will be some review).
  • Static program analysis.  All of the above are essentially different kinds of static program analysis.  We will look at other forms of analysis of whole or partial programs, such as syntactic pattern matching, used in tools like findBugs and xgcc, and constraint-based analysis, as used in Ccured and CQual.  We will consider the cost and quality of various forms of analysis.
  • Dynamic analysis.  A drawback of all of the static techniques above is that they must assume various levels of conservatism because they consider all possible executions of a program.  By contrast, dynamic analysis examines particular runs of a program at a finer granularity.  We will examine how dynamic analysis can be used to discover program properties, as in Daikon, and then possibly feed back to inform static analysis.

The material we cover will be both theoretical (i.e. formalisms of relevant systems, mechanisms, etc.), and practical, in that we will actually use a variety of tools and understand their implementation.  The first few weeks of the course will cover some basic theoretical aspects, and then the remainder will consist of an interplay between learning concepts and using tools that employ them.  Students will implement a project in the second half of the course.

Prerequisites

The course requires students to have taken CMSC 631 or CMSC 838Y, or else acquire instructor approval.

Class Structure and Grading

The majority of the course will consist of reading and discussing web-available technical papers on the above topics, and in using tools.  In general, the course (and grade) will have four elements (subject to change up until class starts):

  • Class Participation. (15%) This consists of discussion on the course weblog regarding the papers we read and the tools we use, discussion in class, and a presentation the student will give about a particular tool or technique.  Presentation guidelines are given below.
  • Homeworks. (20%) Think of these as "take home exams."  Homeworks will develop your skills in the more theoretical aspects of the course, though some programming may also be required.  Homeworks will be a strong preparation for the final exam.
  • Project. (40%) The student will propose a project idea roughly 1/3 of the way through the semester, to be finalized 1/2 of the way through, and then completed by the end of the semester. Students may also propose to work in groups of two, but the expectation of the project quality/difficulty will be twice as great. Project ideas will be posted at a later time.
  • Final Exam. (25%) The final exam will be comprehensive and, along with the homeworks, will count for comp credit.

Project

The project will take place in two stages:

  1. Project proposal. The proposal can either be written (roughly 1 page) that indicates your thoughts on projects, or oral (i.e. meet with me in person). You should indicate what areas you are interested in exploring, and how you hope to explore them. The latter part should express your general feeling as to methodology (e.g. whether you want to do something theoretical or practical, whether you want to implement something, do a survey, etc.). You should then justify both the relevance of the project to the course (why will you learn something by doing it?), and also the magnitude of the effort (can you get it done this semester, or at least get some reasonable piece done that fits into a more ambitious effort?).

  2. Do the project. Projects should at the least have a written component, reporting what you accomplished. My belief is that the best projects will have both an implementation and analytic component.  Chances are, you will want to build on the area in which you presented.  Project ideas will be posted.

Presentations

Each student will give one lecture about a particular tool or technique, starting at roughly halfway through the semester. Presentation quality is important: read your papers early and be sure you fully understand them. I will be available for help, if you need it. Be sure that you have prepared a good presentation (consider practicing it beforehand). 

Presentations will be judged based on the following criteria (horked from Chris Hawblitzel's course syllabus):

  • understanding: does the presenter understand the material?
  • thoughtfulness: does the presenter have insights and opinions beyond what was in the paper?
  • background/perspective: did the presenter read background papers?
  • clarity: can the audience understand the presentation? is the "big picture" clear? are there useful examples?
  • materials: do the slides or use of blackboard illustrate and support the talk? are there diagrams to help convey the technicalities? (when your talk gets into deep territory, a diagram is worth 10K words)
  • delivery: has the the presenter practiced?
  • non-regurgitation: did the presenter do something beyond simply typing sections of the paper as bullet points? did the presenter motivate the ideas in their own words or just state ideas from the paper verbatim?
  • answering questions: can the presenter handle questions from the audience?
Remember that you will likely be able to explain more detail than you can hope to cover in a single lecture. This is one reason that it's hard work to prepare a good presentation: not only do you need to understand the paper, but you need to filter out the irrelevant details and amplify the key arguments. You'll probably have omit entire sections of the paper from your talk -- don't worry about it. Simply mimicking the structure of the paper ("regurgitating it") tends to produce a disconnected sequence of boring facts. A good talk should tell a story; every idea should be motivated, and all facts should fit together in a coherent picture. Telling such a story in a short time often requires creating your own explanations, motivation, and examples. 

Final Exam

The final exam will likely be a take-home exam; in essence a longer, comprehensive homework.

Academic Dishonesty

The college policy on academic dishonesty is strictly followed.  All graded materials (whether exams, summaries, presentations, or projects) must be strictly individual efforts.  In the case of a group project, only collaboration amongst the group is permitted.