Today's session consisted of two parts: a review of the TAL0 concepts discussed in the previous session, and a motivational introduction to the TAL1 concepts that will be discussed in the next session. Part 1: TAL0 Review The goal of Morrisett's TAL work is to prove useful safety properties of assembly programs. To do so, he first defines the semantics of a limited assembly language "TAL0" in terms of an abstract machine. He carefully defines that abstract machine so that malformed programs become "stuck" at some point during execution. An example of a malformed program is one that tries to add an int and a label. There is no rule in the definition of the abstract machine that allows the abstract machine to carry out this operation -- instead, the abstract machine becomes "stuck". Given these TAL0 semantics, Morrisett goes on to demonstrate how an analyst can use a type system to prove that a given program is "well-typed". When executed on the abstract machine, well-typed programs do not become "stuck". Non-well-typed programs could (but may not) become stuck. At this point we had a review of Figures 4.1--4.4 from the Morrisett chapter, followed by an definition of "type safe" programs as programs that have the properties of "progress" and "preservation": A program has the "progress" property iff |- (H,R,I) then either I is HALT or there exists (H',R',I') such that (H,R,I) -> (H'R'I'). or informally: if the program is in a good state (H,R,I), the current instruction is either HALT, or the program can reach another good state (H',R'I') by executing one more step. In the case of TAL0, there is no HALT instruction; we encode this as L: jump L. A program has the "preservation" property iff if |- (H,R,I) and (H,R,I) -> (H',R',I') then |- (H',R',I') or informally, given that the program is in a good state (H,R,I) from which it can reach a new state (H',R',I'), then this new state (H',R',I') must also be a good state. Note that for TAL0, H' always is the same as H since there is no heap allocation. At this point we worked through a typing proof for the simple program "l1: jump l1". Unfortunately, I didn't get it all down in time, but the gist of it seemed to be as follows: o When you do a TAL0 typing proof, the bottom-most rule in your proof tree will be the (S-MACH) rule. (S-MACH) is the rule that concludes that your program is well-typed. o All the rules except for (S-GEN) and (S-INST) are syntax-directed. That means, except for (S-GEN) and (S-INST), you can always tell which rule to apply by looking at the instruction or construct with which you are currently dealing. If you're working on a jump, try (S-JUMP). If you're working with a register, try (S-REG). o The goal of the proof is to show that a program is well-typed, but... how do you determine what the actual type (shown as PSI in figure 4-4) is? The answer is you take an educated guess, and if you can't prove it for the initial type you guessed, make another guess and try again. o You can use the (S-GEN) and (S-INST) rules at any time in your proof to generalize a concrete type into a polymorphic type (GEN), or to make a concrete type from an abstract type (INST). When a given series of TAL0 instructions doesn't care about the types of some of the registers, give those registers a polymorphic type. That way, anyone can call that instruction sequence, regardless of what type they presently have in those "don't care" registers. Part 2: TAL1 Teaser The second part of the session introduced TAL1 - a version of TAL0 expanded to represent types for data allocated dynamically on the heap and the stack. We talked about some of the programs you could write in TAL1 that you couldn't in TAL0 - most notably programs that do (non-trivial) recursion. TAL1's new features also introduce the need to handle problems with pointer aliasing - cases where two pointers reference the same object in memory. Example: imagine the case where two registers R1 and R2 both contain the address of the same word in memory: r1 = r2 = (1); /* the type for r1 and r2 is (int) */ But then we use one register to change the type of that word in memory: mem[r1] := (label); /* r1's type is now code(...), r2's is still (int) */ This instruction will type-check but will get stuck during execution: r3 := mem[r2]; /* r2 still thinks it points to int, but it's wrong! */ r3 := r3 + 1; /* I'm stuck - can't add an int and a label! */ because mem[r2] = mem[r1] = some label, not an int. The TAL1 solution to this pointer aliasing problem is to put a restrictive discipline on the way pointers work to control aliasing. Specifically, TAL1 pointers come in two flavors: "unique" and "shared". TAL1's type system does not allow you to copy a unique pointer. (That is, a program that does so will not be well-typed.) Consequently, unique pointers can have no aliases. A program can change the type of a unique pointer and no problems will result, because unique pointers have no aliases. Shared pointers can have aliases. TAL1's type system prevents soundness problems by forbidding a program to change the type of a shared pointer. (That is, a program that does so will not be well-typed.) The restrictions on unique pointers make it difficult to write programs. Consequently, TAL1 includes a COMMIT pseudo-instruction that coerces a unique pointer into a shared pointer. There is no corresponding coercion the other way. TAL1 models the stack. When you do a typing proof of a TAL1 program, you must assign a type to the stack. The stack's type will generally start with a polymorphic component to stand in for "whatever was on the stack before the function I'm working on now was called". For example, a function that takes two int arguments might expect the stack to have this type: FORALL RHO . ( int, int, code(...), RHO ) where RHO stands in for the stuff pushed by previously called functions, the two ints are the function's arguments, and the code(...) is the address to which the function should return when it is done. Using a polymorphic type like RHO also ensures that the function can't misbehave by modifying the data previously pushed on the stack for other functions: you won't be able to construct a proof that instantiates RHO to a concrete type, and without a concrete type, any instructions that use the data represented by RHO won't type-check. Moreover, because the size of RHO is not known, you cannot even make a copy of it during the function's execution. - Tim Fraser