Project 2

CMSC 412

Due Friday October 5, at 6:00pm

Overview

Project 2 requires you to implement signals and signal handlers. A signal is a process-level interrupt. A process can send a signal to another process. The process receiving the signal will, at some point, stop what it is doing, execute a signal-handler function, and then resume what it was doing.

Signals

A signal is an inter-process communication mechanism that allows one process to invoke a signal-handler function in another process. Each process maintains a table of (function pointers to) signal handlers, one for each signal the process can handle. A process uses a "signal" system call to manipulate this table, i.e., to register a signal-handler function for a signal number. A process uses a "kill" system call to send a signal number to a process identified by its pid. The signal number represents an index into the table of signal handlers in the target process.

When a signal is sent to a process, the kernel arranges for the signal handler function to be called within the process. The slightly tricky part is to ensure that when the signal handler returns, control is passed back to the kernel, which arranges for the process to resume from wherever it was. To accomplish this, we define a "return signal" system call that is to be invoked by the process when its signal handler returns. There is also an "initialization" system call to inform the kernel of the user side of the return signal syscall. In implementing signals, you will need to arrange for the process to have both a context when executing the signal handler and a saved context that the signal handler will return to after execution. In preparation, we describe context switching next.

1. Background

1.1. Context Switching

To give the impression of kernel threads running concurrently, the kernel gives each thread a small time quantum to run. When this quantum expires, or the thread blocks for some reason, the kernel will context-switch to a different thread.  To do this, it must save the state of the currently-running thread, load the state of the thread to switch to, and then start running the new thread.  The code to switch to a new thread is written in assembly code, in the routine Switch_To_Thread.

Two important considerations are: (1) where should I save the thread context during a context-switch? (2) what should this context consist of?  These questions are answered in turn.

1.2. The Kernel Stack (Thread Stack)

The kernel stack is the stack used by a GeekOS kernel thread while it is executing in the kernel.  As usual, the kernel stack stores the local variables used by the kernel thread while running GeekOS kernel routines.  This could be for kernel threads performing system processes, like the reaper thread, or it could be for kernel threads implementing user processes, executing system calls on their behalf.

When performing a context switch, the current state (or "context") of a thread is saved on its kernel stack.  This state consists of the current values of (most of) the registers.  The fields stackPage and esp, defined in the Kernel_Thread structure, specify where the thread's kernel stack is (esp is the kernel stack pointer).  This way, when a thread is to be context-switched to, the current thread switches to the new thread's stack, and then restores the context.

Stacks grow downward, from numerically higher addresses to numerically lower addresses.

1.3. User Processes

User processes have a kernel stack, for calls within the kernel when the kernel is running on the process's behalf, and a user stack, for local variables while running user-level code.

To prepare a user process to be run for the first time, GeekOS pushes the same state on the kernel stack that it would have had if it had previously been running and was interrupted in a system call or by being preempted. The state pushed onto the kernel stack includes the following:

  1. Context Information: this includes (almost) all the registers used by the user (GS, FS, ES, DS, EBP, EDI, ESI, EDX, ECX, EBX, EAX).
  2. Error code and Interrupt number.
  3. Program counter: this contains the value that should be loaded into the instruction pointer register (EIP). Initially, when the user process is about to run, GeekOS pushes the entry point for the process and this value will be subsequently loaded into EIP.
  4. Text selector: this is the selector corresponding to the code segment (CS) of the process.
  5. The EFlags register.
  6. User stack data selector and user stack pointer: these point to the location of the user stack.

(For more information about selectors, see the appendix to project 1.)

When the thread is scheduled for the first time, these initial values will be loaded into the corresponding processor registers and the thread can run. The initial stack state for a user thread is described in the following figure (check Setup_User_Thread() in src/geekos/kthread.c):

(high memory: pushed first)

User Stack Location:
  User Stack Data Selector (data selector)
  User Stack Pointer (to end of user's data segment)

Interrupt_State:
  Eflags
  Text Selector (code selector)
  Program Counter (entry addr)
  Error Code (0)
  Interrupt Number (0)
  EAX (0)
  EBX (0)
  ECX (0)
  EDX (0)
  ESI (Argument Block address)
  EDI (0)
  EBP (0)
  DS (data selector)
  ES (data selector)
  FS (data selector)
  GS (data selector)

(low memory: pushed last)

The items at the top of this diagram (in high memory) are pushed first, the items at the bottom (in lower memory) are pushed last (i.e., the stack grows downward). In this figure, the contents of the stack, not including the user stack location, are defined in struct Interrupt_State in geekos/int.h. This is the structure you're familiar with from modifying system calls in syscall.c.
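
As a reading aid, the figure can be transcribed into a C struct. The sketch below is only an illustration: the struct names, the field names, and the use of uint32_t are assumptions made for this example, and the authoritative definition remains struct Interrupt_State in geekos/int.h. Fields are listed from the lowest address (pushed last) to the highest (pushed first).

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative transcription of the figure; names are assumptions.
 * The first field sits at the lowest address (it was pushed last). */
struct Interrupt_State_Sketch {
    uint32_t gs, fs, es, ds;        /* data selectors */
    uint32_t ebp, edi, esi;         /* esi initially holds the argument block address */
    uint32_t edx, ecx, ebx, eax;
    uint32_t intNum;                /* interrupt number */
    uint32_t errorCode;
    uint32_t eip;                   /* program counter */
    uint32_t cs;                    /* text (code) selector */
    uint32_t eflags;
};

/* For a user thread, the user stack location was pushed first, so it
 * lies at higher addresses, just past the interrupt state. */
struct User_Frame_Sketch {
    struct Interrupt_State_Sketch state;
    uint32_t espUser;               /* user stack pointer */
    uint32_t ssUser;                /* user stack data selector */
};
```

With this layout, a pointer to the saved interrupt state can also be used to reach the user stack pointer that follows it.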

1.4. The User Stack

The user stack selector is the same as the data selector: both the stack and the data segment occupy the same memory segment. The user stack starts at the high end of the data segment and grows downward. Initially, the user stack pointer should indicate an empty stack. So it points to the end of the data segment.

When switching from kernel mode to user mode, the kernel calls Switch_To_User_Context() in src/geekos/user.c. Switching the context includes the following steps:

2. Project Requirements

This project will require you to make changes to several files. In general, look for the calls to the TODO() macro. These are places where you will need to add code, and they will generally contain a comment giving you some hints on how to proceed. There are three primary goals of this project:

2.1. Signals

In this project, you must implement signal handling and delivery for the following four signals (defined in include/geekos/signal.h):

SIGKILL:
This is the signal sent to a process to kill it. You should write the handler for this signal such that it results in the same behavior as in project 1's Sys_Kill. The process is not permitted to install a signal handler for SIGKILL.
SIGUSR1, SIGUSR2:
"User-defined" signals with no pre-determined purpose. These will be sent only by other processes.
SIGCHLD:
When a child process dies, if its parent is not already blocked Wait()ing for it, a SIGCHLD signal should be sent to the parent process. For this project, a "background" process must keep a reference to its parent (its owner pointer should point to the parent, and its initial refCount should be 2). When a background process dies, the parent can be informed of this fact by SIGCHLD, and thus can reap the child, using the Sys_WaitNoPID system call, defined below.

Further Reading: More information about signal handling can be found in Chapter 4 of the text.  A nice tutorial on UNIX signals can be found here.

2.2. System Calls

In this project, you will implement five system calls; the user-space portion of these calls is defined for you in src/libc/signal.c:

Sys_Signal:
This system call handler registers a signal handler for a given signal number. The signal handler is a function that takes the signal number as an argument (it may not be useful to it), processes the signal in some way, then returns nothing (void). If called with SIGKILL, return an error (EINVALID). The handler may be set as the pre-defined "SIG_DFL" or "SIG_IGN" handlers. SIG_IGN tells the kernel that the process wants to ignore the signal (it need not be delivered). SIG_DFL tells the kernel to revert to its default behavior, which is to terminate the process on KILL, USR1, and USR2, and to discard (ignore) on SIGCHLD. A process may need to set SIG_DFL after setting the handler to something else.
Sys_RegDeliver:
This system call handler is invoked by Sig_Init, which is invoked by the _Entry function (in src/libc/entry.c), which is invoked prior to running the user program's main(). The system call passes the (user-side) ReturnSignal syscall (the so-called trampoline) to the kernel.
Sys_Kill:
In project 1, this system call handler took as its argument the PID of a process to kill. In this project, it will be used to send a signal to a certain process. So in addition to the PID, Sys_Kill will take a signal number: one of the four defined above. It should be implemented as setting a flag in the process to which the signal is to be delivered, so that when the given process is about to start executing in user space again, rather than returning to where it left off, it will execute the appropriate signal handler instead.
Sys_ReturnSignal:
This system call handler is not invoked by user-space programs directly, but rather is invoked by some stub code that the user process is made to execute at the completion of a signal handler. That is, before the kernel allows a user process to execute a signal handler, it arranges the stack so that the user process will "return" to this stub code (which invokes ReturnSignal).
Sys_WaitNoPID:
This system call handler is not part of the signalling system, but rather is used in the SIGCHLD signal handler. Recall that the Sys_Wait system call handler takes as its argument the PID of the child process to wait for, and returns when that process dies. The Sys_WaitNoPID handler, in contrast, takes a pointer to an integer as its argument, and reaps any zombie child process that happens to have terminated. It places the exit status in the memory location the argument points to and returns the pid of the zombie. If there are no dead child processes, then the system call returns ENOZOMBIES.
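
The search performed by Sys_WaitNoPID can be sketched over a simplified process table. Everything named below (Proc_Sketch, its fields, wait_no_pid_sketch, and the placeholder value of ENOZOMBIES_SKETCH) is hypothetical and chosen for the example. A real handler would walk the kernel's own thread list and would have to copy the exit status into user memory rather than write through a plain pointer.

```c
#include <stddef.h>

/* Placeholder error value; the real ENOZOMBIES is defined by GeekOS. */
#define ENOZOMBIES_SKETCH (-1)

/* Hypothetical, simplified process record. */
struct Proc_Sketch {
    int pid;
    int parentPid;
    int alive;      /* 0 once the process has exited (a zombie) */
    int exitCode;
    int reaped;     /* 1 once the parent has collected the exit code */
};

/* Reap any one zombie child of parentPid: store its exit code in *status
 * and return its pid, or return ENOZOMBIES_SKETCH if there is none. */
int wait_no_pid_sketch(struct Proc_Sketch *table, size_t count,
                       int parentPid, int *status)
{
    for (size_t i = 0; i < count; i++) {
        struct Proc_Sketch *p = &table[i];
        if (p->parentPid == parentPid && !p->alive && !p->reaped) {
            *status = p->exitCode;
            p->reaped = 1;
            return p->pid;
        }
    }
    return ENOZOMBIES_SKETCH;
}
```

Unlike Sys_Wait, this sketch never blocks: it either finds a zombie right away or reports that there is none.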

Termination

If the default handler is invoked for SIGKILL, SIGUSR1, or SIGUSR2, Print("Terminated %d.\n", g_currentThread->pid); and invoke Exit.

Reentrancy and Preemption

Sending a signal corresponds to setting a "pending signal" flag in the user context object of the process; the signal handler need not be executed immediately. In particular, if the process is executing a signal handler, do not start executing another signal handler. Further, multiple invocations of kill() to send the same signal to the same process before it begins handling even one will have the same effect as just one invocation of kill(). For example, if two children finish while another handler is executing (and blocked), the SIGCHLD handler will be called only once. However, if one child finishes while the parent's SIGCHLD handler is executing, another SIGCHLD handler should be called. See the sigaction() man page if in doubt about reentrancy. The delivery order of pending signals is not specified. (They need not be delivered in the order received.)
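
One way to get these semantics is a bitmask of pending signals plus a "currently in a handler" flag. The sketch below is an assumption-based illustration, not the required design: the type and function names are invented for the example, and signal numbers are assumed to be 1 through 4.

```c
#include <stdbool.h>

#define SIG_LIMIT_SKETCH 5   /* assumed: valid signal numbers are 1..4 */

/* Hypothetical per-process signal bookkeeping. */
struct Sig_State_Sketch {
    unsigned pending;    /* bit n set means signal n is pending */
    bool inHandler;      /* true while a signal handler is running */
};

/* Sending is idempotent: repeated sends of one signal set the same bit. */
void send_signal_sketch(struct Sig_State_Sketch *s, int sig)
{
    if (sig >= 1 && sig < SIG_LIMIT_SKETCH)
        s->pending |= 1u << sig;
}

/* Pick a deliverable signal, clear its bit, and mark the handler as
 * running. Returns 0 if nothing is pending or a handler is active. */
int take_signal_sketch(struct Sig_State_Sketch *s)
{
    if (s->inHandler)
        return 0;
    for (int sig = 1; sig < SIG_LIMIT_SKETCH; sig++) {
        if (s->pending & (1u << sig)) {
            s->pending &= ~(1u << sig);
            s->inHandler = true;
            return sig;
        }
    }
    return 0;
}

/* Called when the handler has returned (via the return-signal path). */
void handler_done_sketch(struct Sig_State_Sketch *s)
{
    s->inHandler = false;
}
```

Because the pending bit is cleared when delivery starts, a signal that arrives while its own handler is running sets the bit again and is delivered once more after the handler finishes, while two sends before delivery collapse into one.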

2.3. Signal Delivery

To implement signal delivery, you will need to implement (at least) five routines in src/geekos/signal.c:

Send_Signal:
This takes as its arguments a pointer to the kernel thread to which to deliver the signal, and the signal number to deliver. This should set a flag in the given thread to indicate that a signal is pending. This flag is used by Check_Pending_Signal, described next.

Check_Pending_Signal:
This routine is called by code in lowlevel.asm when a kernel thread is about to be context-switched in. It returns true if the following THREE conditions hold:
  1. A signal is pending for that user process.
  2. The process is about to start executing in user space. This can be determined by checking the Interrupt_State's CS register: if it is not the kernel's CS register (see include/geekos/defs.h), then the process is about to return to user space.
  3. The process is not currently handling another signal (recall that signal handling is non-reentrant).
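
The three conditions can be sketched as a single predicate. The struct, its fields, and the value of KERNEL_CS_SKETCH below are assumptions made for illustration; the real kernel code selector is defined in include/geekos/defs.h, and the real routine would read CS from the saved Interrupt_State.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for illustration; see include/geekos/defs.h. */
#define KERNEL_CS_SKETCH 0x08u

/* Hypothetical view of the per-thread state the check needs. */
struct Thread_Sketch {
    bool isUserProcess;       /* false for pure kernel threads */
    unsigned pendingSignals;  /* nonzero if any signal is pending */
    bool handlingSignal;      /* true while a handler is running */
};

/* Returns true only when all three conditions hold. */
bool check_pending_signal_sketch(const struct Thread_Sketch *t,
                                 uint32_t savedCs)
{
    if (!t->isUserProcess || t->pendingSignals == 0)
        return false;                 /* 1: no signal is pending */
    if (savedCs == KERNEL_CS_SKETCH)
        return false;                 /* 2: returning to kernel code */
    if (t->handlingSignal)
        return false;                 /* 3: already in a handler */
    return true;
}
```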

Set_Handler:
Use this routine to register a signal handler provided by the Sys_Signal system call handler.
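
A minimal sketch of the registration logic follows, assuming the handler table is an array indexed by signal number. The signal numbers, the sentinel values used for SIG_DFL and SIG_IGN, the error value, and all names ending in _SK are assumptions for the example; the real definitions are in include/geekos/signal.h. Rejecting out-of-range signal numbers is also an assumption, since the text only requires rejecting SIGKILL.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed numbering and error value, for illustration only. */
enum { SIGKILL_SK = 1, SIGUSR1_SK, SIGUSR2_SK, SIGCHLD_SK,
       MAXSIG_SK = SIGCHLD_SK };
#define EINVALID_SK (-1)

typedef void (*signal_handler_sk)(int);

/* Sentinel "handlers" that are never called; the values are assumed. */
#define SIG_DFL_SK ((signal_handler_sk)(uintptr_t)1)
#define SIG_IGN_SK ((signal_handler_sk)(uintptr_t)2)

/* Example user handler: takes the signal number and returns nothing. */
void sample_handler_sk(int sig)
{
    (void)sig;
}

/* Record handler h for signal sig. SIGKILL may not be overridden. */
int set_handler_sk(signal_handler_sk table[MAXSIG_SK + 1], int sig,
                   signal_handler_sk h)
{
    if (sig < 1 || sig > MAXSIG_SK)
        return EINVALID_SK;   /* assumption: reject unknown signals */
    if (sig == SIGKILL_SK)
        return EINVALID_SK;   /* required by the specification */
    table[sig] = h;
    return 0;
}
```

Storing SIG_DFL or SIG_IGN in the same table lets the delivery code decide, with one lookup, whether to build a user-level frame or handle the signal inside the kernel.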

Setup_Frame:
This routine is called when Check_Pending_Signal returns true, to set up a user process's user stack and kernel stack so that when it starts executing, it will execute the correct signal handler, and when that handler completes, the process will invoke the ReturnSignal system call to go back to what it was doing. IF instead the process is relying on SIG_IGN or SIG_DFL, handle the signal within the kernel. IF the process has defined a signal handler for this signal, function Setup_Frame will have to do the following:
  1. Choose the correct handler to invoke.
  2. Acquire the pointer to the top of the user stack. It is saved on the kernel stack just past the saved interrupt state (at a higher address, i.e., deeper in the stack), as shown in the figure above.
  3. Push onto the user stack a snapshot of the interrupt state that is currently stored at the top of the kernel stack. The interrupt state is the topmost portion of the kernel stack, defined in include/geekos/int.h in struct Interrupt_State, shown above.
  4. Push onto the user stack the number of the signal being delivered.
  5. Push onto the user stack the address of ReturnSignal (the "signal trampoline" that invokes the Sys_ReturnSignal system call handler), which was registered by the Sys_RegDeliver system call handler, mentioned above.
  6. Change the interrupt state on the kernel stack (a copy of which you have already saved on the user stack) so that:
    1. The user stack pointer is updated to reflect the changes made in steps 3--5.
    2. The saved program counter (eip) points to the signal handler.
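
Steps 2 through 6 can be sketched against a simulated user address space. This is an illustration under stated assumptions: user memory is modeled as a byte array indexed by user address, the struct and function names are invented for the example, and addresses are passed as plain integers. A real Setup_Frame would have to validate the user stack pointer and translate user addresses before copying.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative layouts; names are assumptions (see geekos/int.h). */
struct Int_State_Sk {
    uint32_t gs, fs, es, ds, ebp, edi, esi, edx, ecx, ebx, eax;
    uint32_t intNum, errorCode, eip, cs, eflags;
};
struct User_Frame_Sk {
    struct Int_State_Sk state;
    uint32_t espUser, ssUser;
};

/* Push len bytes: the stack grows downward, so decrement, then copy. */
static uint32_t push_sk(uint8_t *userMem, uint32_t esp,
                        const void *src, uint32_t len)
{
    esp -= len;
    memcpy(userMem + esp, src, len);
    return esp;
}

void setup_frame_sk(struct User_Frame_Sk *kframe, uint8_t *userMem,
                    uint32_t handlerAddr, uint32_t trampolineAddr,
                    uint32_t sig)
{
    uint32_t esp = kframe->espUser;                          /* step 2 */
    esp = push_sk(userMem, esp, &kframe->state,
                  (uint32_t)sizeof kframe->state);           /* step 3 */
    esp = push_sk(userMem, esp, &sig, sizeof sig);           /* step 4 */
    esp = push_sk(userMem, esp, &trampolineAddr,
                  sizeof trampolineAddr);                    /* step 5 */
    kframe->espUser = esp;                                   /* step 6.1 */
    kframe->state.eip = handlerAddr;                         /* step 6.2 */
}
```

After this runs, the user stack looks like an ordinary call frame for the handler: the return address (the trampoline) is on top, followed by the handler's single argument (the signal number), with the saved interrupt state above it.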

Complete_Handler:
This routine should be called when the Sys_ReturnSignal call handler is invoked (after a signal handler has completed). It needs to restore back on the top of the kernel stack the snapshot of the interrupt state currently on the top of the user stack.
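
The restore step can be sketched in the same simulated setting. The sketch assumes that the handler's return has already popped the trampoline address and that the trampoline pushes nothing before making the system call, so the user stack pointer refers to the signal number with the saved snapshot directly above it. The names are invented for the example, user memory is modeled as a byte array, and a real Complete_Handler would also need to validate what it copies from user space and clear whatever "handling a signal" flag the kernel keeps.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative layouts; names are assumptions (see geekos/int.h). */
struct Int_State_Sk {
    uint32_t gs, fs, es, ds, ebp, edi, esi, edx, ecx, ebx, eax;
    uint32_t intNum, errorCode, eip, cs, eflags;
};
struct User_Frame_Sk {
    struct Int_State_Sk state;
    uint32_t espUser, ssUser;
};

/* Undo the frame built for the handler: skip the signal number, copy
 * the snapshot back onto the kernel stack, and pop it off the user
 * stack so the process resumes exactly where it was interrupted. */
void complete_handler_sk(struct User_Frame_Sk *kframe,
                         const uint8_t *userMem)
{
    uint32_t esp = kframe->espUser;
    esp += (uint32_t)sizeof(uint32_t);              /* signal number */
    memcpy(&kframe->state, userMem + esp, sizeof kframe->state);
    esp += (uint32_t)sizeof kframe->state;          /* saved snapshot */
    kframe->espUser = esp;
}
```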

Notes