CMSC 412, Spring 2010

Project 1
Due: Friday, February 19th, 2010 6:00 pm

Overview

In this project we will be augmenting the GeekOS process services to include (1) background processes, (2) being able to kill a process asynchronously from another process, and (3) being able to view the currently running processes and their status.

Preliminaries

We will first provide some background on process lifetimes in GeekOS, and some more details on how system calls are implemented. The project requirements are presented below. Following the requirements, we present further background material on how process address spaces are implemented. Many of those details are not important for this project but will matter more for project 4, so it might be useful to understand at least some of them now.

Process Creation and Termination

As you already know, a user process in GeekOS is essentially a Kernel_Thread with an attached User_Context. Each Kernel_Thread has a field alive that indicates whether it has terminated (e.g., whether Exit() has been called). In addition, a Kernel_Thread has a refCount field that is used to indicate the number of kernel threads interested in this thread. When a thread is alive, its refCount is always at least 1, which indicates its own reference to itself. If a thread for a process is started via Start_User_Thread with a detached argument of "false", then the refCount will be 2: one self-reference plus a reference from the owner. When detached is false, the owner field in the new Kernel_Thread object is initialized to point to the Kernel_Thread spawning it (aka the parent). Typically, the parent of a new process is the shell process that spawned it.

The parent-child relationship is useful when the parent wants to retrieve the returned result of the new process using the Wait() system call. For example, in the shell (src/user/shell.c), if Spawn_Program is successful, the shell waits for the newly launched process to terminate by calling Wait(), which returns the child process's exit code. The Wait system call is implemented by using thread queues, which we explain below.

When a process terminates by calling Exit, it detaches itself, removing its self-reference. Moreover, when the parent calls Wait, it removes the other reference, bringing refCount to 0. At that point the Reaper process is able to destroy the thread, discarding its Kernel_Thread object. Any process that is dead but has not been reaped is called a zombie. A process can remain a zombie for many reasons; one is that the parent never releases its reference, whether through a bug or otherwise.

More about process lifetimes: Zombies

A process can be in one of four states on its way from being alive to being dead:

1. refCount=0, alive=false

This process is a zombie that's "totally dead," as the child has done Exit to reduce its refCount, and if it had a parent at all, it reduced its refCount too. Thus, the process will soon be reaped.

2. refCount=1, alive=false

The process has called Exit(), but the parent hasn't called Wait(). In this case, the process is also a zombie, but is not on the graveyard queue.

3. refCount=1, alive=true

The process is a background process, and is alive and well.

4. refCount=2, alive=true

The process is a "foreground" process, and is alive and well.
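The four states can be read as a small decision table. The following is only a sketch of that table: the enum Lifetime_State and the function Classify_Lifetime are hypothetical names that do not appear in GeekOS, and the function takes the two field values as plain arguments rather than reading a real Kernel_Thread.

```c
#include <stdbool.h>

/* Hypothetical labels for the four states listed above. */
enum Lifetime_State {
    LIFETIME_REAPABLE_ZOMBIE,  /* refCount=0, alive=false */
    LIFETIME_UNWAITED_ZOMBIE,  /* refCount=1, alive=false */
    LIFETIME_BACKGROUND,       /* refCount=1, alive=true  */
    LIFETIME_FOREGROUND,       /* refCount=2, alive=true  */
    LIFETIME_UNEXPECTED        /* any other combination   */
};

/* Map a (refCount, alive) pair onto one of the states above. */
enum Lifetime_State Classify_Lifetime(int refCount, bool alive)
{
    if (!alive) {
        if (refCount == 0) return LIFETIME_REAPABLE_ZOMBIE;
        if (refCount == 1) return LIFETIME_UNWAITED_ZOMBIE;
    } else {
        if (refCount == 1) return LIFETIME_BACKGROUND;
        if (refCount == 2) return LIFETIME_FOREGROUND;
    }
    return LIFETIME_UNEXPECTED;
}
```

Note that both zombie states share alive=false, so a check of the alive field alone is enough to decide whether a process is a zombie.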

Thread Queues

As processes enter the system, they are put into a job queue. In particular, the processes that reside in main memory and are ready and waiting to execute are kept on a list called the run queue. A process stays there, not executing, until it is selected for execution. Once the process is allocated the CPU and is executing, one of several events can occur:

The process could issue an I/O request and then be placed in an I/O queue. For example, suppose the process makes an I/O to a shared device, such as a disk. Since there may be many processes in the system, the disk may be busy with the I/O request of some other process. The process therefore may have to wait for the disk. The list of processes waiting for a particular device is called the device queue. Each device has its own device queue.
The process could create a new subprocess and Wait for the subprocess's termination, in which case it goes into the wait queue for that subprocess (which is defined in the Kernel_Thread struct) by calling the join() routine in kthread.c.
The process could be removed forcibly from the CPU, as a result of a timer interrupt, and be put back in the run queue.

In the first two cases, the process eventually switches from the waiting state to the ready state and is then removed from its I/O or wait queue and put back in the run queue. A process continues this cycle until it terminates, at which time it is not present on any queue.

System Calls

Sometimes a program will need to interact with the system in ways that require it to access other memory, or otherwise perform privileged operations. For example, if the program wants to write to the screen, it may need to access video memory, which will be outside of the process's segment. Or it may need to use privileged instructions, such as I/O instructions, that only the OS can perform on its behalf.

The operating system therefore provides a series of System Calls, also known as Syscalls, which are routines that carry out some operation for the user process that calls them. But since these routines are themselves in protected memory, the OS provides a special mechanism for making syscalls.

In order to make a syscall in GeekOS, a user program raises a software interrupt using the int instruction. GeekOS provides an interrupt handler that is installed as interrupt 0x90. This routine, called Syscall_Handler (src/geekos/trap.c), examines the current value in the register eax and calls the appropriate system routine to handle the syscall request. The value in eax is called the Syscall Number. The routine that handles the syscall request is passed a copy of the caller's context state, stored in struct Interrupt_State, so the values in the general registers (ebx, ecx, edx) can be used by the user program to pass parameters to the handler routine and can be used to return values to the user program.

The syscall routines are defined in src/geekos/syscall.c: Sys_Null, Sys_Exit, etc. In general, syscall routines will do the following:

Extract any parameters passed by the user process. These are passed in the registers, and are saved in the user context passed to the system call. They can be accessed via state->ebx, state->ecx, etc.
Implement the logic of the syscall
Return the result (or the appropriate error code)

Before returning from the syscall, some OS code must restore the user context so that the program can continue running where it left off. A pointer to the stored copy of the context - on the kernel stack - is actually what is passed to the Syscall_Handler.
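The dispatch path described in this section can be modeled in ordinary C. This is a simplified sketch and not the GeekOS source: Mock_Interrupt_State holds only the four registers mentioned in the text, the table and its two handlers are made up for illustration, and writing the result back into eax is an assumption about how the return value reaches the user program.

```c
#include <stddef.h>

/* Simplified stand-in for struct Interrupt_State: only the registers
 * discussed in the text. */
struct Mock_Interrupt_State {
    int eax;  /* syscall number on entry; result on return (assumed) */
    int ebx;  /* first parameter  */
    int ecx;  /* second parameter */
    int edx;  /* third parameter  */
};

typedef int (*Mock_Syscall)(struct Mock_Interrupt_State *state);

/* Hypothetical handler: takes no parameters and returns 0. */
static int Mock_Sys_Null(struct Mock_Interrupt_State *state)
{
    (void) state;
    return 0;
}

/* Hypothetical handler: adds the values passed in ebx and ecx. */
static int Mock_Sys_Add(struct Mock_Interrupt_State *state)
{
    return state->ebx + state->ecx;
}

/* The index into this table plays the role of the Syscall Number. */
static const Mock_Syscall s_mockTable[] = { Mock_Sys_Null, Mock_Sys_Add };

/* Select the routine by the number in eax; reject numbers outside the table. */
void Mock_Syscall_Handler(struct Mock_Interrupt_State *state)
{
    size_t count = sizeof(s_mockTable) / sizeof(s_mockTable[0]);

    if (state->eax < 0 || (size_t) state->eax >= count) {
        state->eax = -1;
        return;
    }
    state->eax = s_mockTable[state->eax](state);
}
```

Because every handler receives only the saved register state, passing an extra parameter to a syscall means placing it in another register on the user side and reading it from the corresponding field on the kernel side.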

Further Reading: More information about system calls in general can be found in Chapter 2 of the text.

Project Requirements

There are three primary goals of this project:

Add "background" processes
Support asynchronous killing of background processes
Support printing the process table (i.e., information about the currently-running processes)

Add background processes

As explained above, in order for a process to be reaped, a parent must Wait() on the child process's pid. However, there may be situations in which the parent would like to let the child process just complete on its own and die a graceful death, without having to Wait() on it. To do this, we need a way to spawn a child "in the background," so that the parent can continue on with its work, oblivious of what the newly spawned process is doing. To implement background processes, do the following:

· Modify the Sys_Spawn system call to expect an additional argument from the user process: whether or not to spawn in the background. Remember that the Sys_* functions take only one argument, an Interrupt_State, so to pass another argument you will need to modify the macro defining the system call so that it puts another value into the registers before INT 0x90 is executed.

A background process is one that starts life detached; i.e., it has no parent, and thus its refCount starts at 1. In addition, a background process and all of its child processes (even if spawned as "foreground" processes) should not be allowed to read input characters. In particular, the GetKey system call should fail, returning -1. In turn, the Read_Line library routine callable from user programs should return -1 if it fails to get a key. The shell should be modified to handle this possibility (since it calls Read_Line).

A parent process will not be able to Wait on any process it spawns in the background; the Sys_Wait system call will return -1 (or some more descriptive/appropriate negative value) in this case. Note, however, that there is no restriction on background processes spawning, and waiting on, foreground processes (which, according to the restriction above, should not be able to read input).

· Modify src/user/shell.c to handle spawning processes in the background. Modify the command-parsing code to look for '&' as the last character and, if it is present, adjust the call to Spawn() accordingly. Finally, the shell should print [PID] after it spawns a background process, where PID is the process ID of the spawned process. For example, a background spawn of the process "null.exe" at the shell would look something like the following:

$ null.exe &
[10]
$
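One way to detect the trailing '&' is sketched here. Strip_Background_Marker is a hypothetical helper, not something in the provided shell.c, and it uses strlen and strchr from the host C library; you would need to adapt it to however your shell stores the command line.

```c
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper: if the last non-blank character of `command` is '&',
 * remove it (and any blanks before it) in place and return true.
 * Otherwise return false; only trailing blanks are trimmed.
 */
bool Strip_Background_Marker(char *command)
{
    size_t len = strlen(command);

    /* Drop trailing spaces, tabs, and newlines. */
    while (len > 0 && strchr(" \t\r\n", command[len - 1]) != NULL) {
        command[--len] = '\0';
    }

    if (len == 0 || command[len - 1] != '&') {
        return false;
    }

    /* Remove the marker, then any blanks that preceded it. */
    command[--len] = '\0';
    while (len > 0 && (command[len - 1] == ' ' || command[len - 1] == '\t')) {
        command[--len] = '\0';
    }
    return true;
}
```

The returned flag is the value the shell would then hand to Spawn() as the new background argument, and the shell would print the [PID] line only when the flag is true.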

Killing processes

Once a background process, or any other process, starts to run, it may behave badly, or the work it is performing may become irrelevant. Therefore, we would like to have some way for one process to kill another process. To do this, do the following:

· Implement the Sys_Kill() system call (a stub appears in src/geekos/syscall.c) that takes a PID as its argument. The semantics are that the given process is killed immediately. (Note that this is unsafe in a general setting, since that process might hold shared resources, but there aren't any shared resources in GeekOS. Wait until project 3!).

This is different from a thread calling Exit(). When a thread is executing, it is not in any queue and is referenced only by g_currentThread. The cleanup for a thread that calls Exit itself therefore does not need to consider any queues. An asynchronous kill, however, can happen at any time: for example, while the target process is waiting for its child process to die and so is in the wait queue for that process, or while it is in the runQueue of the system. When performing an asynchronous kill, you must therefore ensure that you properly remove the process from all queues it is in.
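The queue cleanup can be pictured with a toy linked list. The types and functions in this sketch (Mock_Thread, Mock_Queue, Mock_Remove_From_Queue, Mock_Remove_From_Any) are invented for illustration and are much simpler than the GeekOS thread queues; in the kernel, a walk like this would also have to be protected against concurrent changes to the queues.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified thread and queue types for illustration. */
struct Mock_Thread {
    int pid;
    struct Mock_Thread *next;
};

struct Mock_Queue {
    struct Mock_Thread *head;
};

/* Unlink `target` from `queue` if present; return true if it was found. */
bool Mock_Remove_From_Queue(struct Mock_Queue *queue, struct Mock_Thread *target)
{
    struct Mock_Thread **link = &queue->head;

    while (*link != NULL) {
        if (*link == target) {
            *link = target->next;
            target->next = NULL;
            return true;
        }
        link = &(*link)->next;
    }
    return false;
}

/* Try each queue in turn; a thread with one link sits on at most one. */
bool Mock_Remove_From_Any(struct Mock_Queue *queues[], size_t count,
                          struct Mock_Thread *target)
{
    for (size_t i = 0; i < count; i++) {
        if (Mock_Remove_From_Queue(queues[i], target)) {
            return true;
        }
    }
    return false;
}
```

The point of the second function is that the killer does not know in advance where the victim is parked, so it has to check the run queue and every wait queue the victim could be on.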

Also consider what should happen with a killed process's child processes: Their parent pointer is now invalid, and so they should be adjusted accordingly. Indeed, the same thing should happen for Exit, but does not in the implementation we provide---it turns out that this field is not used by the child process after its parent dies, so it can be left dangling. However, in your modified code, you will be using the parent pointer to print the process table, so both Exit and Kill should behave similarly, nulling the dangling parent pointer.

There is an interesting design point here: if a parent dies and fails to Wait() on its children, should we also decrement the children's refCount? If this does not happen, the child will remain a zombie, so decrement the counter.
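The parent-pointer and refCount adjustments for a dying parent might look like the following sketch. Mock_Proc and Mock_Orphan_Children are hypothetical; the real Kernel_Thread has many more fields, and the sketch assumes you have some way to visit every thread in the system.

```c
#include <stddef.h>

/* Hypothetical, simplified view of a thread: just the fields used here. */
struct Mock_Proc {
    int pid;
    int refCount;
    struct Mock_Proc *owner;  /* parent, or NULL if detached */
};

/*
 * For every thread owned by `dying`, clear the owner pointer and drop the
 * reference the parent held. Returns the number of children detached.
 */
int Mock_Orphan_Children(struct Mock_Proc *all[], size_t count,
                         const struct Mock_Proc *dying)
{
    int detached = 0;

    for (size_t i = 0; i < count; i++) {
        if (all[i] != NULL && all[i]->owner == dying) {
            all[i]->owner = NULL;
            if (all[i]->refCount > 0) {
                all[i]->refCount--;
            }
            detached++;
        }
    }
    return detached;
}
```

A live child goes from refCount 2 to 1 and carries on as if it were a background process, while a child that had already exited goes from 1 to 0 and becomes eligible for reaping. The sketch only adjusts the counters; handing a thread to the reaper once its count reaches 0 is left out.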

When determining which processes can be killed, note that a process in category II above (see "More about process lifetimes: Zombies"), that is, one with refCount=1 and alive=false, should be allowed to be killed. This happens when a child has died but its parent has not yet done a Sys_Wait, and we want Sys_Kill to be able to clean up the system nonetheless.

Add a src/user/kill.c user program. This should take one or more command-line arguments, each of which is a process PID, and invoke the Kill system call on each, returning 0 if all of the system calls succeed, or the result of the first system call that fails (the program must terminate at the first such failure). The Kill system call stub in user space has been defined for you; its prototype appears in include/libc/process.h.
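The control flow of kill.c is short enough to sketch. To keep the sketch runnable outside GeekOS, the system call is passed in as a function pointer and a stand-in called Fake_Kill takes its place. Kill_Fn, Kill_Each, Fake_Kill, and g_fakeKillCalls are all hypothetical names. The sketch also assumes that the call takes an int PID and returns 0 on success; check the real prototype in include/libc/process.h. It uses atoi from the host C library to convert the arguments.

```c
#include <stdlib.h>

/* Assumed shape of the Kill stub: takes a PID, returns 0 on success. */
typedef int (*Kill_Fn)(int pid);

/*
 * Call `kill_fn` on each PID in argv[1..argc-1]. Stop at the first failure
 * and return its result; return 0 if every call succeeds.
 */
int Kill_Each(int argc, char **argv, Kill_Fn kill_fn)
{
    for (int i = 1; i < argc; i++) {
        int rc = kill_fn(atoi(argv[i]));
        if (rc != 0) {
            return rc;
        }
    }
    return 0;
}

/* Counts calls so the early exit can be observed. */
int g_fakeKillCalls = 0;

/* Stand-in for the real Kill stub: pretends only PIDs 1..99 exist. */
int Fake_Kill(int pid)
{
    g_fakeKillCalls++;
    return (pid > 0 && pid < 100) ? 0 : -1;
}
```

In the real program, main would simply return the result of the loop with the actual Kill stub in place of Fake_Kill.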

Printing the process table

Now that we can run many processes from the shell at once, we might like to get a snapshot of the system, to see what's going on. Therefore, you will implement a program and a system call that prints the status of the threads and processes in the system:

Implement the system call Sys_PS that returns relevant information about the current processes. The user should pass a pointer to an array of Process_Info structs, along with the size of the array. These structs are defined as

struct Process_Info {
     char name[128];
     int pid;
     int parent_pid;
     int priority;
     int status;
};

Here, pid and parent_pid should be self-explanatory. The "name" field comes from the program argument to Spawn(); for kernel processes this should be "{kernel}" by default. The "status" field should be 0 for runnable threads (i.e., threads that are in the runQueue or actually running), 1 for blocked threads (i.e., threads that are waiting on some I/O queue, or on the queue of a child process), and 2 for zombies (i.e., threads that are no longer alive but have not yet been reaped). The proper #defines for these, and the above struct, are set up for you in include/geekos/user.h, which is included from include/libc/process.h. Finally, priority is the priority number of the process, with respect to scheduling. You can get this information from the Kernel_Thread and User_Context structs, though you may need to augment them.

When printing out the status of a process, consider it a zombie if it falls in category 1 or 2 above (see "More about process lifetimes: Zombies"); that is, a process is a zombie if its alive field is false.

Add a src/user/ps.c user program. This takes no arguments and should execute the Sys_PS system call to extract the process information and print it out. For the status field, print R for runnable or currently running, B for blocked, and Z for zombie. Format the output as in the following:

PID PPID PRIO STAT COMMAND
  1    0    1    B {kernel}
  2    0    1    R {kernel}
  3    0    1    B {kernel}
  4    0    1    B {kernel}
  5    0    1    B {kernel}
  6    1    1    B /c/shell.exe
  7    0    1    B /c/forktest.exe
  8    7    2    R /c/null.exe
  9    7    2    R /c/null.exe
 10    7    2    R /c/null.exe

The PS system call stub in user space has been defined for you; its prototype appears in include/libc/process.h. Your process table must have space for at least 50 entries. Please use "%3d %4d %4d %4c %s" as the format string to achieve the formatting in the table as shown. Failure to use the format string may cause tests to fail.
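To see how the required format string produces the table layout, here is a small sketch that formats one row into a buffer. Status_Char and Format_PS_Row are hypothetical helpers, the '?' fallback for unknown status codes is an assumption, and snprintf from the host C library stands in for whatever print routine your ps.c actually uses.

```c
#include <stdio.h>

/* Same layout as the struct shown earlier in this section. */
struct Process_Info {
    char name[128];
    int pid;
    int parent_pid;
    int priority;
    int status;
};

/* Map the status codes from the text (0, 1, 2) to R, B, Z.
 * Returning '?' for anything else is an assumption of this sketch. */
char Status_Char(int status)
{
    switch (status) {
    case 0: return 'R';
    case 1: return 'B';
    case 2: return 'Z';
    default: return '?';
    }
}

/* Format one row into `buf` using the required format string. */
int Format_PS_Row(char *buf, size_t size, const struct Process_Info *info)
{
    return snprintf(buf, size, "%3d %4d %4d %4c %s",
                    info->pid, info->parent_pid, info->priority,
                    Status_Char(info->status), info->name);
}
```

The field widths in the format string are what line each value up under the PID, PPID, PRIO, and STAT column headers, which is why changing the string can break output comparisons.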