CMSC 412 Project #3

Scheduling and Synchronization

Due Thursday, October 30th, at 11:59pm

Overview

This project has two parts:

  1. Scheduling. As you might have already noticed, GeekOS uses a priority-based, preemptive Round Robin algorithm. In this project, you will add your own scheduling algrorithm from your homework. To see the difference between these algorithms, you will also implement a means to check the system's current time.
  2. Synchronization Implementation.  You will implement semaphores, a simple synchronization primitive. In addition, you will provide user programs with semaphore-manipulating system calls.

Scheduling

Adding a new scheduler.  You will augment the existing GeekOS Round Robin scheduling algorithm with your scheduling algorithm from the homework.

Your operating system should be able to select which scheduling algorithm is being used via a system call: the system call int Set_Scheduling_Policy(int policy) should be implemented for this purpose. If the value of policy is 0, the system should switch to round robin scheduling, if the policy is 1, the system should switch to your scheduler. Other values of this parameter should result in an error code being returned (i.e. a -1 return value).

When the system boots up, it will have to use your scheduler. So don't forget to set your scheduler as the initial scheduler.

The choice of which scheduler to use should be made within the function Get_Next_Runnable() (see kthread.c). Any function that calls Get_Next_Runnable() should be unaware of which scheduling algorithm is being used (i.e., do not try to pass the scheduling type as an argument). It should only be aware that some thread has been selected.

Get System Time.  One way to compare scheduling algorithms is to see how long it takes a process to complete from the time of creation to the termination of the process. You will investigate these differences by implementing the Get_Time_Of_Day() syscall.

Get_Time_Of_Day() will return the value of the kernel global variable g_numTicks. The variable is already implemented in the kernel (see timer.h), you only need to implement the system call to read it. You can use this system call in a user program to determine how much time has elapsed while the program was running. You can do this by calling Get_Time_Of_Day() once at the beginning of the process (in the user code) and once at the end. You can calculate how long the process took to run, as well as when the process first got scheduled (based on ticks). Notice that there is no attempt to remove time spent by other processes. For example, if your process context switches out, then a second process runs, the second process's time during the context switch will be included in the first process's elapsed time. This is known as "wall clock" time. One can also just calculate the time used by the process itself. This is called "process time" (or sometimes virtual time) but this project is not concerned with that.

Additional Reading.  Scheduling is covered in Chapter 5 of the text.  Multi-level feedback scheduling is covered in Section 5.3.6, pps. 168-169.

Semaphores

You will add system calls that provide user programs with semaphores, to enable thread synchronization among different threads. The system calls (on the user side) will be:

int Create_Semaphore(const char *name, int ival)
int P(int sem)
int V(int sem)
int Destroy_Semaphore(int sem)

Create_Semaphore

Create_Semaphore(name, ival) is a request by the current process to use a semaphore.  The user gives a name for the semaphore, as well as the semaphore’s initial value, and will get back a semaphore ID, an integer between 0 and N - 1. The semaphore ID denotes a particular semaphore datastructure in the kernel, which you must implement.  The semaphore ID is then passed by the user program to the operations P() and V(), described next, to wait or signal the associated semaphore. 

Your operating system should be able to handle at least 20 (thus N = 20) semaphores whose names may be up to 25 characters long. If there are no semaphores left (i.e., there were N semaphores with unique names already given), -1 must be returned indicating an error.

The returned semaphore ID is chosen in one of two ways.

  1. If this is the first time Create_Semaphore has been called for the given name, the kernel should find and return an unused SID, and initialize the value of the associated semaphore datastructure to ival.
  2. If another thread has made this system call with the same name, and the semaphore has not been destroyed in the meantime (see below), you must return back the same semaphore ID (sem) that was returned the first time. The parameter ival is ignored in this case.
Think of a semaphore ID as like a file descriptor in UNIX: in that case, when you open a file, you get back a number (the file descriptor) that denotes that file.  Subsequent read and write operations take that file descriptor as an argument, and the kernel figures out which file the number is associated with, and then performs the operations on that file.  Just the same way, you will implement a semaphore datastructures within the kernel, and refer to them from user programs via their associated semaphore IDs.

P and V

The P(sem) system call is used to decrement the value of the semaphore associated with sempahore ID sem.  This operation is referred to as wait() in the text.  Similarly, the V(sem) system call is used to increment (signal() in the text) the value of the semaphore associated with sem.

As you know, when P() is invoked using a semaphore ID whose associated semaphore's count is less than or equal to 0, the invoking process should block.  To block a thread, you can use the Wait function in the kernel. Each semaphore data structure will contain a thread queue for its blocked threads.  The file thrqueue.h provides an implementation of a thread queue. You should look at kthread.h and kthread.c to see how it is declared and used. To wake up one thread/all threads waiting on a given semaphore, i.e. because of a V(), you can use Wake_Up_One()/Wake_Up() routines from kthread.h.

A process may only legally invoke P(sem) or V(sem) if sem was returned by a call to Create_Semaphore for that process (and the semaphore has not been subsequently destroyed).  If this is not the case, these routines should return back -1.

Destroy_Semaphore

Destroy_Semaphore(sem) should be called when a process is done using a semaphore; subsequent calls to P(sem) and V(sem) (and additional calls to Destroy_Semaphore(sem) by this process) will return -1.

Once all processes using the semaphore associated with a given semaphore ID have called Destroy_Semaphore, the kernel datastructure for that semaphore can be destroyed.  A simple way to keep track of when this should happen is to use a reference count.  In particular, each semaphore datastructure can contain a count field, and each time a new process calls Create_Semaphore, the count is incremented.  When Destroy_Semaphore is called, the count is decremented.  When the count reaches 0, the semaphore can be destroyed.  Note that when a thread exits, the kernel should automatically call Destroy_Semaphore() on behalf of this thread, for all the semaphores it has in its list.

Notes

In order not to clobber syscall.c with too much functionality, you might want to put your semaphore implementation in two new files sem.h and sem.c.  Semaphore operations need to be implemented within a critical section, so that operations execute atomically.

Since you need to have multiple processes running concurrently to test the functionality you will implement, your shell should be able to launch processes in background.  You can do as you did in either project 1 or 2.

In this, and other projects, you will rely heavily upon a list data structure. For this reason an implementation has been provided to you in list.h file. Please familiarize yourself with its syntax and functionality. It could be a little tricky to understand the syntax since functions are written using #define. Naturally you are always free to extend, modify, or write your own implementation that would better suit your needs.

Additional Reading.  Semaphores are covered in Chapter 6, Section 6.5, of the text; pps. 202-204 describe implementation issues.

Summary: New System Calls

Identifier Kernel Function User Function Effect
SYS_SETSCHEDULINGPOLICY int Sys_SetSchedulingPolicy(struct Interrupt_State* state) int Set_Scheduling_Policy(int policy) if policy is 0 or 1 change scheduler;
else return -1
SYS_GETTIMEOFDAY unsigned long Sys_GetTimeOfDay(struct Interrupt_State* state) unsigned long Get_Time_Of_Day() return g_numTicks
SYS_CREATESEMAPHORE int Sys_CreateSemaphore(struct Interrupt_State* state) int Create_Semaphore(const char *name, int ival) if name is longer than 25 characters, return -1
if a semaphore with this name doesn't exist, create it and return its SID; if it exists, return its SID; note that SID must be >= 0
SYS_P int Sys_P(struct Interrupt_State* state) int P(int sem) might block
wait() semantics
returns -1 if sem invalid
returns 0 on success
SYS_V int Sys_V(struct Interrupt_State* state) int V(int sem) never blocks
signal() semantics
returns -1 if sem invalid
returns 0 on success
SYS_DESTROYSEMAPHORE int Sys_DestroySemaphore(struct Interrupt_State* state) int Destroy_Semaphore(int sem) never blocks
returns -1 if sem invalid
returns 0 on success

Testing your code

The files we provided can be used to test your semaphores or scheduling algorithm:

Final Notes

We do not require that your earlier projects worked; you should be able to implement this project directly from the base kernel, without using the earlier kernel features.  Perhaps a small exception to this is the convenience of background processes for testing purposes, but this is straightforward.  However, the base kernel that we have provided will have some important system calls and other routines unimplemented.  However, you can "implement" these routines by simply removing the TODO("...") and having them return 0.  This goes for Sys_RegDeliver, Sys_Signal, and Check_Pending_Signal.

If you do choose to merge your kernel with the initial kernel for this project (preferred), here is a trick that may help you; it takes two steps:
  1. finding out the differences between project 2 base and your project 2 implementation
  2. integrating these changes into the project 3 base
 Let's assume in your home directory you have three directories: project2, project2-solution, project3.
  1. So let's find the implementation differences first:
     
    $ cd ~/project2/src/geekos
    $ diff -u -r . ~/project2-solution/src/geekos > ~/diff.patch 
    
  2. And then integrate them into project 3 base:
     
    $ cd ~project3/src/geekos
    $ patch -p0 < ~/diff.patch 
    $ rm *orig
    
You will likely get a few errors when doing this, particularly in kthread.c.  This is because some functions that you changed to get project 2 to work have been changed in the base kernel for this project; for example, this may be the case for functions Make_Runnable and Get_Next_Runnable  For these functions, integrate the code by hand.  You can look at the generated .rej files to see what parts of the merge didn't work out.