CMSC 412 Project 3

Scheduling and Synchronization

Due Friday, October 25 Sunday, October 27, 11:59 pm

Overview

This project has two parts.

CPU Scheduling: As you might have already noticed, the default CPU scheduling algorithm in GeekOS is priority-based preemptive round-robin. You will add your own CPU scheduling algorithm, and a system call to choose between your algorithm and the original round-robin scheduler. To measure the difference between them, you will add a system call to get the system's current time.
Semaphores: You will implement semaphores, a simple synchronization construct for threads. You will provide system calls to create, use and terminate semaphores.

Preparation

Do svn update in your network directory.

Part 1: Scheduling

You will augment GeekOS with your own CPU scheduling algorithm. Set your scheduler as the initial one, i.e., the one being used when the system boots up. You can make this setting in Init_Scheduler in kthread.c.

You will provide the following two system calls so that you can run user programs to compare your scheduler against the default.

Set_Scheduling_Policy (Kernel entry point: Sys_SetSchedulingPolicy)

This system call requests to choose a CPU scheduler. The details of the interface are as follows.

int Set_Scheduling_Policy(int policy, int schedParam)

If parameter policy is 0, the system should switch to its built-in round robin scheduling.
If parameter policy is 1, the system should switch to your scheduler.
For other values of this parameter, return EINVALID.

The second parameter, schedParam, is for use in your scheduling algorithm. You may ignore it or use it any way you wish. For instance, it could hold the number of ticks in a quantum, or the "alpha" parameter used in the text's description of shortest-job-first scheduling.

The second parameter is provided to make it easier for you to tune your algorithm, if it contains a tuneable parameter. However, we will test your code with schedParam set to zero. You should adopt some reasonable default behavior for that case.

Get_Time_Of_Day (Kernel entry point: Sys_GetTimeOfDay)
This system call gets the system's current time. The time is available in the kernel global variable g_numTicks (see timer.h); the system call only needs to read and return it.
unsigned long Get_Time_Of_Day():

Return g_numTicks
You can obtain the elapsed time during a user program's execution by invoking the system call at the beginning and at the end of the program (in the user code). Use such measurements to compare your scheduler versus the default scheduler; include them in your justification in README.scheduler.
Note: This elapsed time is known as wall clock time. It includes the time when the system is not doing anything on behalf of the program's process, e.g., when the process is switched out but runnable, when the process is waiting on an IO device that is serving another process. The time during which the process is executing on the CPU is referred to as process time (or sometimes virtual time); this project is not concerned with obtaining that.

The choice of scheduler should be implemented within the function Get_Next_Runnable() in kthread.c. Do not change the prototype of this function, as it is called from assembly code. That is, you should not try to pass the choice of scheduling algorithm to Get_Next_Runnable, or return anything except for the thread it selects. Instead, the function should refer to a flag (set when the user calls Set_Scheduling_Policy).

You are free to create a new scheduling algorithm that is optimized for anything you want. However, there are two constraints. First, you need to explain its goals, i.e., what it is intended to optimize. When you turn in your code, put your explanation (justified with measured elasped times) in a README.scheduler file in the top level of the class project directory. Second, make sure that your algorithm is no worse than the default algorithm in the time it takes to complete user programs. Otherwise the tester may time out and give up on your submission.

Additional Reading

Scheduling is covered in Chapter 5 or 6 of the text, depending on the edition that you are using.

Part 2: Semaphores

You will augment GeekOS with semaphores so that user programs can open them, use them for thread synchronization, and close them.

For specification purposes, it is convenient to view each semaphore as having the following attributes while it exists. ("For specification purposes" means that these attributes need not be explicitly present in your implementation.)

name: at most 25 characters.
semaphore id (or SID): integer.
value: non-negative integer.
open users: set of user threads that currently have the semaphore "open". Note that this is not the same thing as the queue of threads that are waiting for the semaphore.

The name is supplied by user programs. The SID is supplied by the kernel; it points to a particular semaphore data structure (which you will implement) in the kernel. The SIDs come from a fixed pool of 20 SIDs, initially all available. SIDs are somewhat analogous to file descriptors, in that they are given to user processes as "handles" to kernel data structures. They are different, though, in that SIDs are shared across all processes, while each process has its own set of file descriptors.

You will add the following four system calls to manipulate semaphores.

Open_Semaphore(name,ival)

This is a request by the executing thread to open the semaphore with the given name and initial value. If a semaphore of that name does not currently exist, a semaphore is created with initial value ival and an available SID, and the SID is returned. If a semaphore with that name currently exists, its SID is returned. In both cases, the caller now has the semaphore open. A more complete interface of the system call follows:

int Open_Semaphore(const char *name, int ival):

If a semaphore of the given name does not currently exist and there is an available SID, the kernel should return an available SID and initialize the value of the associated semaphore data structure to ival. The caller now has the semaphore open. The SID is now not available.

If a semaphore of the given name does not currently exist and there is no available SID, the kernel should return ENOSPACE. No semaphore is created.

If a semaphore with the given name currently exists, the kernel should return the SID of the semaphore. The caller now has the semaphore open. (The ival is ignored.)

If name is longer than 25 characters, the kernel should return ENAMETOOLONG. No semaphore is created.

Close_Semaphore(sem)

This is a request to close the semaphore (with SID) sem. The thread no longer has the semaphore open. (So this should be called when the thread is done using the semaphore.) Further, if the semaphore has no open users, it is terminated (its kernel data structure can be destroyed) and its SID becomes available. A more complete interface of the system call follows:

int Close_Semaphore(int sem):

Returns EINVALID if sem is not an SID or if the caller does not have semaphore (with SID) sem open. (The latter also covers the case where semaphore sem does not currently exist.)

Otherwise returns 0. The caller no longer has the semaphore open. If the semaphore now has no open users, it is terminated and SID sem becomes available.

P(sem)

This system call performs the standard "P" (or "wait") operation on the semaphore with SID sem.

int P(int sem):

Returns EINVALID if sem is not an SID or if the caller does not have the semaphore sem open.

If the caller has the semaphore open:

Returns 0 only when the semaphore's value is greater than zero and simultaneously decreases the semaphore's value by 1.
Blocks otherwise.

V(sem)

This system call performs the standard "V" (or "signal") operation on the semaphore with SID sem.

int V(int sem):

Returns EINVALID if sem is not an SID or if the caller does not have the semaphore with SID sem open.
If the caller has the semaphore open: returns 0 and simultaneously increases the semaphore's value by 1.

Each semaphore data structure should contain a thread queue for its blocked threads. To block a thread (in a "P" operation), you can use the Wait function in the kernel. To wake up a thread (in a "V" operation), you can use the Wake_Up_One function. (If you want to wake up all the threads in a thread queue, you can use the Wake_Up function. Look at kthread.h and kthread.c to see how these functions are declared and used.

You have some options regarding the implementation of V(sem) when processes are blocked on semaphore sem. One option is to select one or more blocked processes and restore each to redo the P(sem) in which it was blocked. Another option is to select one (and only one) blocked process and restore it to just after the P(sem) in which it was blocked. (Here, the V operation is also simultaneously executing the blocked thread's P operation. Think about how the semaphore's value should be updated.)

Thread termination

If a thread exits while it has one or more semaphores open, the kernel should close this thread from every such semaphore. (So it makes sense to have a semaphore-closing function in the kernel which can be invoked by both the Sys_Close_Semaphore() function and some function involved in terminating user threads.)

Notes

In order not to clobber syscall.c with too much functionality, put your semaphore implementation in sem.h and sem.c. Semaphore operations must be executed atomically, i.e., with interrupts disabled.

Since you need to have multiple processes running concurrently to test the functionality you will implement, your shell should be able to launch processes in the background.

As in previous projects, you will rely heavily upon the list data structures provided in the file list.h. There is some background material on these structures in the slides for Project 1. Naturally you are always free to extend, modify, or write your own implementation that would better suit your needs.

Additional Reading

Semaphores are covered in Chapter 6 or 7 of the text, depending on which edition you are using.

Summary: New System Calls

Identifier	Kernel Function	User Function
SYS_SETSCHEDULINGPOLICY	int Sys_SetSchedulingPolicy (struct Interrupt_State *state)	int Set_Scheduling_Policy (int policy)
SYS_GETTIMEOFDAY	unsigned long Sys_GetTimeOfDay (struct Interrupt_State *state)	unsigned long Get_Time_Of_Day()
SYS_OPEN_SEMAPHORE	int Sys_Open_Semaphore (struct Interrupt_State *state)	int Open_Semaphore (const char *name, int ival)
SYS_P	int Sys_P (struct Interrupt_State *state)	int P (int sem)
SYS_V	int Sys_V (struct Interrupt_State *state)	int V (int sem)
SYS_CLOSE_SEMAPHORE	int Sys_Close_Semaphore (struct Interrupt_State *state)	int Close_Semaphore (int sem)

Testing your code

The following files in src/user can be used to test your semaphores and scheduling algorithm:

semtest.c and semtest2.c perform basic semaphore wait and signal operations, but do not attempt to synchronize multiple processes.
semtest1.c runs sem-p1 and sem-p2, which simulate a producer-consumer computation, along with sem-p3, which waits to execute until sem-p2 signals it on a semaphore.
schedtst.c runs sched1, sched2, and sched3, which do some synchronization with semaphores and also occasionally print numbers to the screen to indicate how they are progressing. schedtst takes one argument, which is either "rr" to use the round-robin scheduler or "mys" to use your scheduling algorithm.
workload.c runs sem-long, sem-ping, and sem-pong, which, again, run alongside one another with some synchronization from semaphores. Each child process prints the total time it took to run, and workload prints the total elapsed time for all three processes to complete. Like schedtst, workload takes the argument "rr" or "mys" to determine what scheduling algorithm to use.
Finally, time.c prints the result of Get_Time_Of_Day, for your reference.