Improving the Reliability of Commodity Operating Systems

Similar idea to the Rio file cache, but broader: isolate an extension, make it recoverable, and do so in a backward-compatible way. The strictly new part is the recoverability.

Isolation is achieved largely by virtual memory protection. Each "lightweight protection domain" has its memory write-protected from domains other than the main kernel. Each domain has its own copy of the page table, which must be kept in sync with the kernel's, since the kernel is what changes it. Whenever there is a domain crossing (XPC), the page table is switched to this synced version. This results in a TLB flush on the x86, which is the main source of performance degradation in Nooks. This is much cheaper than a trap (a change in H/W protection domain). However, it means extensions can execute privileged instructions. Could solve the latter problem using SFI, i.e. don't let them use privileged instructions in ways that are "bad!" The hard part is defining "bad," I suppose.

Another part of isolation is implemented using wrapper stubs. These are functions that interpose on calls into and out of extensions and perform the XPCs. Direct accesses to objects may be converted into XPCs for better control. Otherwise, the extension modifies a shadow copy in its own domain, whose changes are synced with the actual copy following the XPC: a poor man's transaction. Are XPCs synchronized? It would seem they'd have to be. But this would require knowledge of the rest of the kernel, to know which locks protect which data structures.

There is also an "object tracker." This keeps track of objects allocated/managed by the extension (and the kernel). XPC calls check addresses against the legal objects that are tracked; kernel objects are deep-copied. Not sure how Nooks determines which objects to track. Seems like the programmer needs to be involved here, e.g. to know what an object's lifetime will be.

Recovery largely depends on the object tracker and on well-understood recovery functions for particular objects.
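
To make the tracker-plus-recovery-function idea concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the names ObjectTracker, TrackedObject, register_recovery, and recover_all are hypothetical and are not taken from Nooks, and integer handles stand in for kernel addresses.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class TrackedObject:
    kind: str    # hypothetical type tag, e.g. "buffer" or "lock"
    handle: int  # stand-in for a kernel address


class ObjectTracker:
    """Toy model of an object tracker; not the real Nooks interface."""

    def __init__(self) -> None:
        self._objects: Dict[int, TrackedObject] = {}
        self._recovery: Dict[str, Callable[[TrackedObject], None]] = {}

    def register_recovery(self, kind: str,
                          fn: Callable[[TrackedObject], None]) -> None:
        # One recovery function per object type (free, release, ...).
        self._recovery[kind] = fn

    def track(self, kind: str, handle: int) -> None:
        # Record that the extension now holds this object.
        self._objects[handle] = TrackedObject(kind, handle)

    def untrack(self, handle: int) -> None:
        # The extension gave the object back in the normal way.
        self._objects.pop(handle, None)

    def is_legal(self, handle: int) -> bool:
        # The check a wrapper would make on an address passed across an XPC.
        return handle in self._objects

    def recover_all(self) -> None:
        # After a failure: run the type-specific recovery function on every
        # object still tracked, then forget them all. A kind with no
        # registered function raises KeyError in this sketch.
        for obj in list(self._objects.values()):
            self._recovery[obj.kind](obj)
        self._objects.clear()
```

The sketch makes the generality worry visible: recover_all only works for object kinds that someone has supplied a recovery function for.
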
All objects known by the object tracker are freed, released, NULLed, etc., and the extension is unloaded and restarted. I have a hard time believing this is general!

Limitations: can't prevent the use of privileged instructions (extensions run in kernel mode!); can only prevent a limited form of infinite loop.

Questions:
What is this actually getting us?
What are the impacts on security?
They contend that this sort of service should be in regular use. Do you agree with this?
What would you change about the service if you designed it from first principles?
How could more language-oriented approaches, rather than object-tracking, provide a benefit?

----------------------------------------------------------------------
The Rio File Cache: Surviving Operating System Crashes

Key ideas: avoid reliability-related file system writes to disk; instead, make the RAM used for files reliable by:
1) adding battery backup to deal with power failures
2) protecting the file cache using either SFI or VM tricks
3) supporting "warm reboot," in which old file cache memory is restored following a failure.

Point 2) uses VM in that all accesses to file cache pages go through the virtual memory system (i.e. the TLB and page tables, though they just say the TLB); direct physical addressing is not permitted. This way, we can set the pages as read-only except when they must actually be written to by the file cache subsystem. Using VM, rather than SFI, is great since it incurs no additional overhead (except the cost of actually flipping the bits in the page table): this is because the kernel is already running in supervisor mode, and thus does not require a trap. Using SFI would add extra overhead. Protection in general is more useful than disk-related reliability mechanisms because any illegal attempt to write to a file cache page will be fail-stop: the system will panic immediately, rather than corrupt the data structure and permit bad data eventually to get written to disk.

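
The read-only-except-when-writing discipline of point 2) can be illustrated with a small user-level simulation in Python. This is a sketch, not Rio's code: ProtectedFileCache, KernelPanic, write_window, and cache_write are hypothetical names, and a per-page boolean stands in for the page-table write bit that the real system would flip.

```python
from contextlib import contextmanager


class KernelPanic(Exception):
    """Stands in for the fail-stop panic on an illegal write."""


class ProtectedFileCache:
    """Simulated file cache whose pages are read-only by default."""

    def __init__(self, num_pages: int, page_size: int = 4096) -> None:
        self._page_size = page_size
        self._pages = [bytearray(page_size) for _ in range(num_pages)]
        # Simulated write-permission bit, one per page.
        self._writable = [False] * num_pages

    @contextmanager
    def write_window(self, page: int):
        # Open the page for writing, and always close it again afterwards.
        self._writable[page] = True
        try:
            yield
        finally:
            self._writable[page] = False

    def store(self, page: int, offset: int, data: bytes) -> None:
        # Any store to a protected page fails immediately (fail-stop).
        if not self._writable[page]:
            raise KernelPanic(f"illegal write to file cache page {page}")
        if offset < 0 or offset + len(data) > self._page_size:
            raise ValueError("write crosses the page boundary")
        self._pages[page][offset:offset + len(data)] = data

    def load(self, page: int, offset: int, length: int) -> bytes:
        # Reads are always allowed.
        return bytes(self._pages[page][offset:offset + length])

    def cache_write(self, page: int, offset: int, data: bytes) -> None:
        # The only legitimate write path: the file cache subsystem itself.
        with self.write_window(page):
            self.store(page, offset, data)
```

A stray store outside cache_write raises KernelPanic before any byte changes, which is the fail-stop property the notes describe; in the real system the cost of the write window is the page-table update, not a trap.
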
Point 3) requires some reworking of the way the file cache is written so that the data structure is always in a stable state. This can be done by making changes to "shadow copies" and then atomically linking the change into the persistent data structure. This is basically like general-purpose persistence, but without language support: no transactions, no garbage collection. However, language support would make a lot of sense, particularly transactions. The GC part is not needed since all RAM is persistent (though not all of it CONSISTENT), and thus we don't need the reachability part.

Questions:
How would you implement the system using SFI? They didn't really go into the details.
Why should we think that SFI is a bad idea in this case, as compared to VM? How is this setting different from the one motivating SFI, SPIN, or Nooks?
Does this approach suggest ways in which VM-based protection could be more performance-friendly? What would you have to do?
What is the relationship between security and reliability? Is this paper all about reliability, or does it have security ramifications? How do we measure these?

----------------------------------------------------------------------
Extensibility and Safety in the SPIN Operating System

Similar to Nooks, but from scratch, for the whole OS: co-location, enforced modularity and logical protection domains (i.e. isolation), and dynamic call binding (to support extensibility).

Questions:
They make the comment that apps still use VM and so on: why not put everything in the kernel? Presumably we must still trust the extensions.
What is the protection model? How does it relate to H/W-based protection?
  Capabilities
  Protection domains (i.e. interfaces + linking)
Event-based extension model. Calling a function is raising an event; an event may have many handlers. Guards indicate which handlers can be called. May side-effect expressions passed to multiple handlers!
How do the SPIN memory management services interact with the Modula-3-based GC used for extensions themselves?
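
The event-based extension model noted above (calling a function raises an event, an event may have many handlers, and guards decide which handlers run) can be sketched in a few lines of Python. This is an illustrative model only: SPIN itself is written in Modula-3, and the names Event, install, and raise_ are hypothetical.

```python
from typing import Any, Callable, List, Optional, Tuple

Guard = Callable[..., bool]
Handler = Callable[..., Any]


class Event:
    """Toy event with guarded handlers; not SPIN's actual dispatcher."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._handlers: List[Tuple[Optional[Guard], Handler]] = []

    def install(self, handler: Handler, guard: Optional[Guard] = None) -> None:
        # An extension binds a handler, optionally restricted by a guard.
        self._handlers.append((guard, handler))

    def raise_(self, *args: Any) -> List[Any]:
        # "Calling the function": run every handler whose guard accepts the
        # arguments, in installation order, and collect the results.
        results = []
        for guard, handler in list(self._handlers):
            if guard is None or guard(*args):
                results.append(handler(*args))
        return results
```

Because every selected handler receives the same argument objects, a handler that mutates an argument changes what later handlers see, which is the side-effect hazard flagged in the notes.
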