CMSC 838Z Spring 2005 Final Exam Due May 18, 5pm (either e-mail to me, or drop off with Dr. Foster, Room 4129 AVW). When completing this exam, you may use any of the course readings, Wiki comments, or notes you have taken. You may also refer to reference materials mentioned in class (notably, you can refer to either of the Pierce books for reference). DO NOT SEARCH AROUND FOR ANSWERS ON THE WEB. Most of the questions are discussion questions. For your sanity and mine, keep your answer to each question to one or two paragraphs. Commit to your answer and defend it, rather than enumerate all possibilities. If you have any questions, e-mail me. You can feel free to modify this exam in-line with your answers. You can also send me PDFs of pictures you draw, text you write, etc. Whatever is easiest for you. There are 120 total points on the exam. Good luck! ---------------------------------------------------------------------- QUESTION 1 (Software-based Fault Isolation) 30 points total Software-based Fault Isolation (SFI) can be used to create "sandboxes" so that certain code segments are limited to operating within their own sandboxes. In the SFI technique, the original program is rewritten to properly sandbox it. For this question, you will write a formal description for sandboxing of assembly language programs defined next. Here is the syntax: r ::= r1 | r2 | ... | rk (registers) v ::= (operands) n (integer constants) | r (registers) i ::= (instructions) r := v op v | r := Mem(v) | Mem(v) := v | jump n | if v jump n op ::= + | - | <= | >= ... (operations) A program consists of a memory M, which is a map from integers to integers, and a register file R, which is a map of registers to values. The value stored at address n in memory M is written M(n), and similarly the value of register ri in register file R is written R(ri). We assume there is a partial function decode() that maps integers to instructions (don't think too hard about addresses being a fixed number of bits; rather assume that all instructions can be encoded as a single integer). For simplicity, jumps are always to constant addresses. With this, the operational semantics of our assembly language can be written as a rewriting relation (M,R) --> (M',R'). In the inference rules below, we write E() as a generic function from operands to integers, where E(ri) = R(ri) E(n) = n We write R[ri = e] to define a new register file that is the same as R except that ri is now mapped to e. We do similarly for memory M. Here are the rules: decode(M(R(pc))) = "ri := v1 op v2" n = E(v1 op v2) pc' = R(pc) + 1 ----------------------------------- (OP) (M,R) --> (M,R[ri = n,pc = pc']) decode(M(R(pc))) = "ri := Mem(v)" n = M(E(v)) pc' = R(pc) + 1 --------------------------------- (LOAD) (M,R) --> (M,R[ri = n,pc = pc']) decode(M(R(pc))) = "Mem(v1) := v2" n1 = M(E(v1)) n2 = E(v2) pc' = R(pc) + 1 ---------------------------------- (STORE) (M,R) --> (M[n1 = n2],R[pc = pc']) decode(M(R(pc))) = "jump n" pc' = n --------------------------- (JUMP) (M,R) --> (M,R[pc = pc']) decode(M(R(pc))) = "if v jump n" n' = E(v) where n' != 0 pc' = R(pc) + 1 -------------------------------- (BE-FALSE) (M,R) --> (M,R[pc = pc']) decode(M(R(pc))) = "if v jump n" n' = E(v) where n' = 0 pc' = n -------------------------------- (BE-TRUE) (M,R) --> (M,R[pc = pc']) (20 points) Given a memory M whose first N addresses map to instructions, define a rewriting function Isolate() that maps M to a new memory M'. You can assume that none of the instructions being rewritten use certain registers, which you can therefore use safely in the mapped program M'. Of these, there are two that define the fault domain in which the program is being isolated: the first is the segment start address (SSA), and the second is the segment size (SS). If M' = Isolate(M), then M[1..N] should correspond to a program in M'[SSA...SSA+SS] whose instructions should never read, write, or jump to an address not in the range SSA to SSA+SS. For simplicity, do not change any constants in Mem[v] expressions (i.e. if v is a constant n, leave n alone). You can assume that SS will be sufficiently large to support the rewriting. Also, you don't need to worry about cross-fault-domain RPC. Define Isolate() in terms of an inductive function Rewrite(Mpc,...), where Mpc is the current address in M being considered; i.e. Rewrite(Mpc,...) fault-isolates the instruction that results from decoding M[Mpc]. You'll need to keep track of some auxilliary information as part of the "..." in Rewrite(). For example, you will need to keep track of a mapping from instructions in M to their equivalent in M', so you can properly rewrite jump targets. ANSWER: (10 points) Show the output of your Isolate function for the following memory M, where SSA is 100 and SS is 100. 0: r1 := Mem[150] 1: r2 := Mem[151] 2: r3 := 0 3: if r1 jump 7 4: r3 := r2 + r3 5: r1 := r1 + (-1) 6: jump 3 7: Mem[150] := r3 ANSWER: QUESTION 2 (Typed Assembly Language) 20 points total (10 points) Compare the above assembly language instructions with the instructions defined in TAL-0 (Figure 4-1 of the TAL chapter). What is the substantive difference between the two? How does this difference affect the security policy enforced by TAL as compared to SFI? ANSWER: (10 points) What makes dynamic memory allocation and arrays difficult to handle in TAL and PCC? ANSWER: QUESTION 3 (Model-carrying code) 20 points total (20 points) Model-carrying code (MCC) is an approach to enforcing security policies in untrusted code that is related to proof-carrying code (PCC). Draw a picture of the PCC approach and a picture of the MCC approach that succinctly highlights their similarities and differences. Add any necessary explanation to explain your figure. Be sure to consider both in terms of concept and current implementation. ANSWER: QUESTION 4 (Information flow control) 20 points total (10 points) A standard property of information flow systems is non-interference. Formally state this property and assess its usefulness as a security policy. ANSWER: (10 points) Software fault isolation in its full generality results in a program consisting of two or more fault domains. (In the SFI paper, discussion was simplified to a "trusted domain" and an "untrusted domain" where the untrusted domain was not allowed to write to the trusted domain's memory.) Assume you have a way to generate multiple domains, and that you have a lattice that indicates the relative trust between the domains. For example, given domain s1 and domain s2, if s1 <= s2 then s2 can read/write/jump to s1's code/data, but not the reverse. However, s1 is still permitted to call into s2 via a safe RPC, as defined in the SFI paper. Is this scheme implementing information flow control? To what extent? Justify your answer. ANSWER: QUESTION 5 (Static analysis) 30 points total (10 points) What is the relationship between flow-sensitivity, path-sensitivity, and context-sensitivity in static analysis? ANSWER: (10 points) What is the benefit of a system for intrusion detection via static analysis? In what sense is the analysis static? How does such a system relate to a system that detects a particular fault, such as a buffer overrun? ANSWER: (10 points) We have seen static analysis techniques for source code and for machine code. What are the advantages and disadvantages of the two approaches? ANSWER: