CMSC 412

Project 5: Interprocess Communication and Command Interpreter

Due: May 13, 1996, (5:00 P.M. to Dr. Hollingsworth)

Introduction

Except for Cprintf(), the programs that you have written have done very little I/O, and yet I/O remains a key activity of many user programs. Because I/O services are shared, I/O, too, falls under the management of the operating system. In this project, which builds on project 4, you will build a message-passing library that allows processes to communicate with one another. You will also implement pipes using this library. In addition, you will write a very simple shell that parses the command line and starts up processes.

To enable this effort, we have provided the following new files:

io412.c: This file is now to be compiled and linked only with user applications. It contains a modification of the Cprintf() function.
kprintf.c: This file is now to be compiled and linked in with the kernel. It contains the version of Cprintf() that the kernel will use. You should not link io412 or klib into your kernel. NOTE: If you did not name your screen driver's character output function kputch, you will need to rename it to kputch.

We have written several applications which you will be able to run from your sample shell. For all of these programs, EOF is indicated by pressing the ESC key.

wc.c, wc.mak: This program is a simple version of the word count command. It counts the number of characters, words, and lines in a file. It takes the normal UNIX arguments of -lwc.
cat.c, cat.mak: This program is similar to the UNIX command cat. It reads input from stdin and writes it to stdout.
tee.c, tee.mak: This program is similar to the UNIX command tee. It reads input from stdin and writes one copy to stdout and a second copy to stderr.

Message Passing

Message passing allows processes to communicate by sending messages. In this project, those messages will be sent via message queues. You can think of the message queues as mailboxes. A "send" to a mailbox adds a message to the queue of messages corresponding to that mailbox. A "receive" takes one of the messages and returns it to the receiver. Because the message queues are queues, a send should attach a message to the end of a queue of messages, and a receive should dequeue the message from the front of a queue. The message queue is a queue of arrays of bytes. The kernel must allocate space to hold each message and copy it from the process sending the message. On a receive, the kernel will dequeue the first message in the queue, copy the message into the buffer passed by the receiver, and then free the memory used by the message. If the message is longer than the receiver's buffer, part of it is given to the receiver, while the rest forms a new message that is placed at the head of the message queue.
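The queue behavior described above can be sketched in ordinary C as follows. This is only an illustration under stated assumptions, not the required kernel code: the names struct msg, struct msg_queue, mq_send, and mq_receive are hypothetical, malloc() stands in for whatever allocator your kernel uses, and an empty queue returns -1 here where the real kernel would block the caller.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One queued message: a heap-allocated copy of the sender's bytes. */
struct msg {
    struct msg *next;
    int size;
    char *data;
};

/* Hypothetical layout of one mailbox's message queue. */
struct msg_queue {
    struct msg *head;
    struct msg *tail;
};

/* Append a copy of buf[0..size) to the tail. Returns size, or -1 on error. */
int mq_send(struct msg_queue *q, const void *buf, int size)
{
    struct msg *m;

    assert(q != NULL);
    if (buf == NULL || size <= 0)
        return -1;
    m = malloc(sizeof *m);
    if (m == NULL)
        return -1;
    m->data = malloc((size_t)size);
    if (m->data == NULL) {
        free(m);
        return -1;
    }
    memcpy(m->data, buf, (size_t)size);
    m->size = size;
    m->next = NULL;
    if (q->tail != NULL)
        q->tail->next = m;
    else
        q->head = m;
    q->tail = m;
    return size;
}

/* Copy up to max_size bytes of the head message into buf.
 * Returns the number of bytes copied, or -1 if the queue is empty
 * (a real kernel would block the caller instead). */
int mq_receive(struct msg_queue *q, void *buf, int max_size)
{
    struct msg *m;
    int n;

    assert(q != NULL);
    m = q->head;
    if (m == NULL || buf == NULL || max_size <= 0)
        return -1;
    n = m->size < max_size ? m->size : max_size;
    memcpy(buf, m->data, (size_t)n);
    if (n < m->size) {
        /* Partial read: the remainder stays at the head as its own message. */
        memmove(m->data, m->data + n, (size_t)(m->size - n));
        m->size -= n;
        return n;
    }
    /* Whole message consumed: unlink it and free its memory. */
    q->head = m->next;
    if (q->head == NULL)
        q->tail = NULL;
    free(m->data);
    free(m);
    return n;
}
```

Keeping the leftover bytes in the same node is one way to leave the remainder at the head of the queue; allocating a fresh node for the remainder would work equally well.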

The File Descriptors

Each process (in its pcb) will have a file descriptor array. The array will contain mailbox numbers (i.e., indices into the MQ array described below). An entry will have a value of -1 if the file descriptor is unused. You may also assume this array has a limited size, say, 20. The indices of the file descriptor array are the file descriptors. As in UNIX, file descriptors 0, 1, and 2 are reserved for stdin, stdout, and stderr, respectively. Hence, if you want to send a message to stdout, you will use file descriptor 1.

You should implement the following system calls for creating and using file descriptors. These will, as in previous projects, be simple functions which call gen_interrupt() and set AX and related registers correctly. Due to the behavior of the Borland C compiler, make sure _AX gets set just before you generate the interrupt.

int MQ_Create(char *name):

This will allow you to "create" a message queue from which to send messages. If there is already a mailbox with the given name, then find an unused file descriptor in the process's file descriptor table (array), and assign the mailbox number to this file descriptor (element). Return the file descriptor as the result of this system call. If this is the first call to MQ_Create with the given name, it also creates a new entry in the MQ array. Return -1 if all mailboxes are currently in use (i.e., all mailboxes have at least one user), or all file descriptors are currently in use.

int MQ_Send(int fd, void *buffer, int size):

Once you have a file descriptor (from MQ_Create or fd's 0-2), you can send a message to the message queue associated with the file descriptor. fd will be the file descriptor, buffer will point to the message in the user process, and size is the length of the message. If you are sending a string, strlen() can be used to get the length of the message. This system call must not block the user process. Inside the kernel, your implementation of send will need to allocate a buffer for the message to be sent, and copy the message into that buffer.

int MQ_Receive(int fd, void *buffer, int maxSize):

Processes can receive messages from a message queue using this call. fd is a file descriptor returned by MQ_Create. buffer is where the message will be received. maxSize is the maximum size of the message that can be received. The buffer must already be allocated by the user and should be able to hold at least maxSize characters. If there are no messages in the message queue, then the calling process will block. There should be a blocked queue for each active message queue where PCBs can be placed if blocked. If the size of the message being received exceeds maxSize, then only maxSize bytes should be read. The rest of the message should be placed at the head of the queue as its own message. For example, if maxSize were 10, and the message contained 20 characters, then 10 characters would be written to the buffer, and the remaining 10 characters would be at the head of the queue. The function will return the number of characters read from the message queue. A -1 is returned for any error conditions (invalid file descriptor, for example).

int MQ_Destroy(int fd):

Frees up use of a file descriptor by the current process, and decrements the use count of the message queue. If the use count for the message queue reaches 0, then any pending messages are freed up, and the message queue is deleted. File descriptors can be reused.

For each of the functions listed above, you should have a kernel version of the functions. You should name these functions KMQ_Send(), KMQ_Receive(), and so forth.
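The bookkeeping that MQ_Create and MQ_Destroy perform on the file descriptor array and on the mailbox use counts might look roughly like the sketch below. Several things here are assumptions made for illustration: the explicit struct pcb argument (your kernel versions would more likely use the current process), the table name mq_table, the helper pcb_init, and the NAME_LEN limit. The message queue itself, the blocked queue, and the two reserved mailboxes are left out; in a real process fd's 0-2 would already be occupied.

```c
#include <assert.h>
#include <string.h>

#define MAX_MQ   20   /* mailboxes in the MQ array */
#define MAX_FD   20   /* file descriptors per process */
#define NAME_LEN 32   /* assumed limit on mailbox name length */

struct mailbox {
    char name[NAME_LEN];
    int  users;       /* use count; 0 means the slot is free */
};

struct pcb {
    int fd[MAX_FD];   /* mailbox number, or -1 if unused */
};

static struct mailbox mq_table[MAX_MQ];

/* Mark every file descriptor of a new process as unused. */
void pcb_init(struct pcb *p)
{
    int i;
    for (i = 0; i < MAX_FD; i++)
        p->fd[i] = -1;
}

/* Bind an unused fd of p to the mailbox called name, creating the
 * mailbox if no mailbox with that name is in use. Returns fd or -1. */
int KMQ_Create(struct pcb *p, const char *name)
{
    int fd, i, mb = -1, free_mb = -1;

    assert(p != NULL && name != NULL);
    if (strlen(name) >= NAME_LEN)
        return -1;
    for (fd = 0; fd < MAX_FD && p->fd[fd] != -1; fd++)
        ;
    if (fd == MAX_FD)
        return -1;                    /* all file descriptors in use */
    for (i = 0; i < MAX_MQ; i++) {
        if (mq_table[i].users > 0) {
            if (strcmp(mq_table[i].name, name) == 0) {
                mb = i;               /* existing mailbox with this name */
                break;
            }
        } else if (free_mb < 0) {
            free_mb = i;              /* remember the first free slot */
        }
    }
    if (mb < 0) {
        if (free_mb < 0)
            return -1;                /* all mailboxes in use */
        mb = free_mb;
        strcpy(mq_table[mb].name, name);
    }
    mq_table[mb].users++;
    p->fd[fd] = mb;
    return fd;
}

/* Release fd and drop the mailbox's use count. Returns 0 or -1. */
int KMQ_Destroy(struct pcb *p, int fd)
{
    assert(p != NULL);
    if (fd < 0 || fd >= MAX_FD || p->fd[fd] == -1)
        return -1;
    mq_table[p->fd[fd]].users--;      /* at 0, pending messages would be freed here */
    p->fd[fd] = -1;
    return 0;
}
```

Two processes that call the create routine with the same name end up with file descriptors that refer to the same mailbox number, which is what makes the mailbox usable as a shared channel.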

The MQ array

This data structure is an array of mailboxes. Each mailbox will contain a message queue, the name of the mailbox (determined by MQ_Create), and the number of processes (the user count) that are using this queue. The MQ array should contain at least 20 elements. Two of these will be reserved. They will contain the names "\dev\console" and "\dev\keyboard". For example, mailbox 0 can be used for the console message queue, and mailbox 1 can be used for the keyboard. The user count for these mailboxes should never reach 0. I.e., they should never be reallocated to new names.

There is a subtlety in implementing "\dev\console" and "\dev\keyboard". If you MQ_Send a message to the console, you should write characters to the screen using kputch() rather than appending a message to the message queue. kputch() is the kernel version of Put_char(). You must use this name, because Cprintf() in kprintf.c depends on it. Also, your keyboard interrupt routine should be modified to enqueue characters read from the keyboard into the MQ mailbox for "\dev\keyboard". Each character will be its own message.

You may wish to use KMQ_Create(), the kernel version of MQ_Create(), to create these two mailboxes during the initialization of the OS.

The array is indexed by mailbox numbers, not file descriptors.

For the init process, 0 will refer to the mailbox associated with "\dev\keyboard" while 1 and 2 will refer to the mailbox associated with "\dev\console". If 0 is the mailbox number for "\dev\keyboard" and 1 is the mailbox number for "\dev\console", and if fd is the name of the file descriptor array, then by default, fd[0] will contain 0, fd[1] and fd[2] will contain 1.

Proc_start() modified

  int Proc_start( void *fp, int argc, char **argv, 
                    int stdin, int stdout, int stderr );

Proc_start() will now take 6 arguments, instead of 3. The first three arguments are the same as before. They should contain a function pointer, argc, and argv. The last three arguments are file descriptors (not mailbox numbers). The fourth argument is the file descriptor for stdin, the fifth argument is the file descriptor for stdout, and the sixth argument is the file descriptor for stderr. When a new process is created, file descriptor 0, 1, and 2 for the new process will contain mailbox numbers associated with the file descriptors that were passed in. Hence, if 4, 7, and 2 are the file descriptors passed to Proc_start as the fourth through sixth arguments respectively, fd[ 0 ] of the child process will be the mailbox bound to fd[ 4 ] of the parent process, fd[ 1 ] will contain fd[ 7 ] of the parent process, and fd[ 2 ] will contain fd[ 2 ] of the parent process.

This change will allow you to implement pipes where the stdout (fd 1) of one process can be hooked up to the stdin (fd 0) of another process.

The file descriptors 0, 1, and 2 do not have to be created, or destroyed by the application process. However, to reflect an accurate user count, you should increment and decrement the values in the MQ array during process creation and termination.

Cprintf() modified

Cprintf() has been modified to use MQ_Send(). Because MQ_Send() is a system call executed by the kernel, and because operations in the kernel run atomically (due to the kernel modes), you do not need to put P's and V's around Cprintf's in this project. Cprintf() prints to stdout, which can now be redirected.

Implementing Waitpid()

Waitpid() is a system call with the following prototype:

    int Waitpid( int pid );

A process calling Waitpid() will block until the process with process ID pid terminates. This will be useful in implementing the shell. Each process will have a blocked queue for processes that are waiting for it to terminate. When a process terminates, any processes blocked on a Waitpid() call for it will be woken up. If no process has the PID value passed in, then Waitpid() should return -1 immediately and not wait.

Writing a Simple Shell

Normally, users execute commands using a shell. This is a common idea in most operating systems. The shell is a user process whose purpose is to read in commands, parse the input, then execute the call. For the most part, this will mean calling Proc_start() on the parsed command line.

Your initial process, init, should load the shell module (called shell.mod) and Proc_start() it. The shell should display a prompt using Cprintf(). The shell will read in commands using MQ_Receive. It will then parse the command line, and call Proc_start() if necessary. It will then block using Waitpid() on the PID returned by Proc_start().

To illustrate the operations of the shell, we will start with a simple example.

   os412 %   wc.mod

os412 % is the prompt from the shell. The user (you) types in wc.mod. The shell will then determine that argc is 1, and create an appropriate argv array. It will try to open the file named wc.mod using Load_module(). If this succeeds, then the function pointer for Proc(), argc, and argv will be passed to Proc_start, as well as the default file descriptors. The call should look like:

     Proc_start( fp, argc, argv, 0, 1, 2 );

The shell will use "\dev\keyboard" for its stdin, and "\dev\console" as stdout and stderr. These will be bound to fd's 0, 1, and 2 by init. The new process, by default, "inherits" these values from shell.

You should also be able to parse out additional arguments. For example, if the call were

     wc.mod -f foo

then argc should be 3, and argv should contain "wc.mod", "-f" and "foo".
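One simple way to build argc and argv is to split the command line in place on blanks. The sketch below assumes that approach; the name parse_args and the MAX_ARGS limit are made up for illustration, and only spaces and tabs are treated as separators.

```c
#include <assert.h>
#include <string.h>

#define MAX_ARGS 16   /* assumed upper bound on arguments per command */

/* Split line in place on spaces and tabs, stopping at a newline or at
 * the end of the string. Pointers to the words are stored in argv.
 * Returns argc, or -1 if there are more than max_args words. */
int parse_args(char *line, char *argv[], int max_args)
{
    int argc = 0;
    char *p = line;

    assert(line != NULL && argv != NULL);
    while (*p != '\0' && *p != '\n') {
        /* Skip blanks before the next word. */
        while (*p == ' ' || *p == '\t')
            p++;
        if (*p == '\0' || *p == '\n')
            break;
        if (argc == max_args)
            return -1;
        argv[argc++] = p;
        /* Advance to the end of the word. */
        while (*p != '\0' && *p != '\n' && *p != ' ' && *p != '\t')
            p++;
        if (*p == '\0')
            break;
        if (*p == '\n') {
            *p = '\0';        /* newline ends the command */
            break;
        }
        *p++ = '\0';          /* terminate this word and keep scanning */
    }
    return argc;
}
```

Because the words are terminated inside the original buffer, argv stays valid only as long as that buffer does, so the shell should not reuse the buffer until the command has been started.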

A newline character will signify the end of the command, and should cause the shell to start parsing. You do not have to implement a backslash continuation.

Handling Pipes

A pipe is a mechanism that allows the standard output of one process to be connected to the standard input of another process. A pipe will be denoted by the vertical bar, "|", as it is in UNIX. Hence, if you get the following in your command line:

    os412 %   yay.mod | bar.mod foo

Then, you should Proc_start two processes. Each process will have its own argc and argv. The first process will be "yay.mod" and the second "bar.mod" with the argument "foo". Make sure both modules can be loaded; otherwise, do not complete the pipe, print an error message, and display the prompt again.

If both modules can be loaded, you will need to create a pipe. A pipe is basically a message queue shared by the two processes. Recall that the stdout of the yay.mod process should be hooked to the stdin of bar.mod process. Think of how this can be done using a message queue and appropriate Proc_start's. The shell will then block on the PID values of these two processes. Once completed, a prompt will be displayed.

You should be able to handle multiple pipes.
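A first step toward handling any number of pipes is to cut the command line into stages at each "|". The sketch below shows only that step; the name split_pipeline is hypothetical, and how you create and name the connecting message queues is left to your design. A line with n stages needs n-1 connections between neighbors.

```c
#include <assert.h>
#include <string.h>

/* Split line in place at each '|'. stages[i] points to the text of the
 * i-th command, which may still have blanks around it. Returns the
 * number of stages, or -1 if there are more than max_stages. */
int split_pipeline(char *line, char *stages[], int max_stages)
{
    int n = 0;
    char *p;

    assert(line != NULL && stages != NULL);
    if (max_stages < 1)
        return -1;
    stages[n++] = line;
    for (p = line; *p != '\0'; p++) {
        if (*p == '|') {
            if (n == max_stages)
                return -1;
            *p = '\0';              /* end the previous stage here */
            stages[n++] = p + 1;    /* the next stage starts after the bar */
        }
    }
    return n;
}
```

Each stage can then be turned into its own argc and argv before any process is started, which makes it easy to check that every module loads before committing to the pipeline.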

Exiting

For most commands, you will parse the command and try to load the module. However, you will need a way to quit from the shell. Typing "exit" as a command should cause you to exit the shell (i.e., do not read any more commands from the user). Once the shell exits, there will be no user processes, which means you should return to main() and exit to DOS as in previous projects.

A More Realistic Shell

Your shell will not have some of the features that you normally associate with shells. You will not be able to redirect output to a file, nor read input from a file. You won't be able to place processes in the background or bring them back into the foreground. Because control keys were not implemented in the keyboard driver, we will not be able to implement such standards as the control-C feature (interrupt the current process) or the control-Z feature (place the process in the background). You will not handle globbing (dealing with wildcards) or command histories. It is useful, however, to think about how these might be implemented in this project. You might want to spend ten or twenty minutes thinking about how you would implement these more advanced features. Some of this would not be too difficult. However, you will not be required to implement any of these advanced features (but they make great final questions).

Modifications to Existing Code

Ksleep and Kwakeup

Because you will be using many queues to put processes to sleep, and because Ksleep and Kwakeup currently take only one argument (the semaphore ID), it may become unwieldy to partition the semaphore IDs for so many different uses. Hence, you may wish to add a second argument to both Ksleep and Kwakeup. For example, if the second argument is 0, then you could use the semaphore queues. If the second argument is 1, you might refer to queues relating to Waitpid() system calls, and so forth. Since we will be testing your code by using your shell, there will be no need for our test programs to be aware of how Ksleep and Kwakeup are implemented.

Put_char and Get_char

You will no longer have these functions in klib (MQ_Send and MQ_Receive subsume their roles). You should also remove the cases in System_service for these routines. You will want to keep the kputch function, but it will only be called by KMQ_Send (for the mailbox "\dev\console") and Cprintf (in kprintf.c). Likewise, the queue of characters from the keyboard has been replaced by the mailbox for "\dev\keyboard".