Due Wednesday, May 9th, 2007 (6:00 PM)
The purpose of this project is to add a new filesystem called to GeekOS called GOSFS (for GeekOS FileSystem). GOSFS will support multiple directories and long file names, as well as the standard operations for file management. The new filesystem will reside on the second IDE disk drive in the Bochs emulator. This will allow you to continue to use your existing PFAT drive to load user programs while you test your filesystem. The second IDE disk's image is called diskd.img. It is 10MB by default.
Since GeekOS will have two types of filesystems (PFAT and GOSFS), it will have a virtual filesystem layer (VFS) to handle sending requests to an appropriate filesystem (see figure below). We have provided an implementation of the VFS layer in the file vfs.c. The VFS layer will call the appropriate GOSFS routines when a file operation refers a file in the GOSFS filesystem.
The implementation of PFAT is in pfat.c. You will
implement GOSFS in gosfs.c and relevant system calls in syscall.c.
VFS picks the functions
to call based on supplied structures containing function pointers, one
for each operation. For example, see Init_PFAT in pfat.c: this initializes the
PFAT filesystem with VFS by passing it a pointer to s_pfatFilesystemOps, a
structure that contains function pointers to PFAT routines for mounting
(and formatting) a filesystem. Other PFAT functions are stored in
different structures (e.g., look at the PFAT_Open routine, which passes
structure to VFS). You will analogously use s_gosfsFilesystemOps,
s_gosfsMountPointOps, s_gosfsDirOps, and s_gosfsFileOps
You should also add a call to Init_GOSFS provided gosfs.c
in main.c to register the GOSFS filesystem. In general,
use the PFAT implementation as your guide to interfacing GOSFS with VFS.
Open files are tracked by the kernel using a file descriptor, defined as struct File. Each user
space process will have an associated file descriptor table that records which files the
process can currently read and write. A user process can have up to
USER_MAX_FILES files open at once. The file descriptor table
is implemented as a struct File *fileList[USER_MAX_FILES]
array that you must add to struct User_Context. Note that
not all the entries in the fileList are necessarily open
files, since usually a process has less than USER_MAX_FILES
files open at once. If fileList[i] is NULL, it represents a free
slot (file descriptor is not used). This descriptor will be filled
out by the code in VFS; e.g., see Open in vfs.h,
whose pFile argument is
a pointer to a free slot in the table.
Your filesystem should support long filenames (at most 128 bytes, which includes a null at the end). A file's full path (which includes directories and subdirectories) will be no more than 1024 characters in total.
You should keep track of free disk blocks using a bit vector (as described in the textbook, Section 11.5.1, page 429). A library called bitset is provided (see bitset.h and bitset.c) that manages a set of bits and provides functions to find bits that are 0, which correspond to free disk blocks (note that this is the opposite of the book, which uses a 1 to indicate a free block).
The block size for GOSFS is 4 KB; IDE sectors (a.k.a. disk blocks) are 512 bytes, so a block is 8 sectors. Thus one bit in a bitset corresponds to a 4KB block. For example, a bitset that is 8192 bits (1024 bytes) will keep track of 8192 * 4KB = 32 MB of data.
The filenode for a directory (i.e., a subdirectory) is represented by the GOSFS_DIRENTRY_ISDIRECTORY flag set in the GOSFS_Dir_Entry->flags field. As we shall see later, a filenode for a file stores information about the blocks that the file is composed of. The filenode for a directory on the other hand just stores the location of the one block and contains all the information about that directory. The location of the block that holds the data for the directory will be stored in the first entry in the blocksList array of the directory's filenode (hence entries blocksList..blocksList[GOSFS_NUM_BLOCK_PTRS-1] are unused).Unlike directories, which have a fixed size (one block, irrespective of how many files they hold), files can take up an arbitrary number of disk blocks. You will use a version of indexed allocation to represent the data blocks of your filesystem. The blocksList field in the GOSF_Dir_Entry struct in the gosfs.h file keeps track of data blocks for a file. The first eight entries are direct blocks (i.e., direct pointers to the blocks of the file), the ninth points to a single indirect block, the tenth to a double indirect block (there is no triple indirect block). This is similar to the "Combined scheme" as detailed in Section 11.4.3 in the text, pages 427 - 428.
The mount system call allows you to associate a filesystem with a place in the file name hierarchy. The Mount call is implemented as part of the VFS code we supply (see Mount function in vfs.c); you will implement your Init_GOSFS function so that VFS's mount code will call your function GOSFS_Mount() in gosfs.c. Among other things, it must ensure that the filesystem being mounted is GOSFS.
See the recitation slides for additional details on directory structure and how files are organized.
For this project, we have provided you with a buffer cache implementation, which you can use by including the <geekos/bufcache.h> header file. The Buffer_Cache data structure represents a buffer cache for a particular filesystem instance. You can create a new buffer cache by passing the block device to the Create_FS_Buffer_Cache() function. A buffer containing the data for a single filesystem block is represented by the FS_Buffer data type. You can request a buffer for a particular block by calling the Get_FS_Buffer() function. The actual memory containing the data for the block is pointed to by the data field of FS_Buffer. If you modify the data in a buffer, you must call the Modify_FS_Buffer() function to let the buffer cache know that the buffer is now dirty. When your code has finished accessing or modifying a buffer, it should be released using the Release_FS_Buffer function.
Buffers are accessed in a transactional manner. Only one thread can use a buffer at a time. If one thread is using a buffer and a second requests the same buffer (by calling Get_FS_Buffer()), the second thread will be suspended until the first thread releases the buffer (by calling Release_FS_Buffer()). Note that you need to be careful never to call Get_FS_Buffer() twice in a row, since that will cause a self-deadlock.
The GeekOS buffer cache writes dirty buffers to disk as needed. If you want to immediately write the contents of a buffer back to the disk, you can call the Sync_FS_Buffer() function. It is generally a good idea to write back modified filesystem buffers when they contain filesystem metadata such as directories or disk allocation bitmaps; by writing these blocks eagerly, the filesystem is less likely to be seriously corrupted if the operating system crashes or if the computer is turned off unexpectedly. You can flush all of the dirty buffers in a buffer cache by calling the Sync_FS_Buffer_Cache(); this is useful for implementing the Sync() operation for your mounted filesystems.
The Destroy_FS_Buffer_Cache() function destroys a buffer cache, first flushing all dirty buffers to disk. This may be useful in your implementation of GOSFS_Format().
Here are guidelines for how you should set up your filesystem on the disk.
When you do a Format() , you make a raw disk usable with GOSFS. That is:
You have to implement the semantics of the new system calls as described below. As you can see, the semantics is very similar to the UNIX file system. Some notes:
|Call||User Function||Return on success||Return on failure||Reasons for failure||Comment|
|SYS_MOUNT||Mount(char *dev, char *prefix, char *fstype)||0||-1||
|SYS_OPEN||Open(char *path, int mode)||new file descriptor number||-1||
|SYS_OPENDIRECTORY||Open_Directory(char *path)||new file descriptor number||-1||
||if Delete(file) is called and file is still open in other threads or even in the thread that called Delete(), all the subsequent operations on that file (except Close()) should fail|
|SYS_READ||Read(int fd, char *buffer, int length)||number of bytes read||-1||
|SYS_READENTRY||Read_Entry(int fd, VFS_Dir_Entry *entry)||-1||
|SYS_WRITE||Write(int fd, char *buffer, int length)||number of bytes written||-1||
|SYS_STAT||Stat(char *path, VFS_File_Stat *stat)||0||-1||
|SYS_FSTAT||FStat(int fd, VFS_File_Stat *stat)||0||-1||
|SYS_SEEK||Seek(int fd, int offset)||0||-1||
||offset is an absolute position; for files: could be equal to fileSize or greater, then Write appends, see above. For directories, it seeks to the absolute directory entry, and is an error if there's no valid entry at the given index.|
||Should NOT create directories recursively; e.g. CreateDirectory("/d/d1/d2/d3/d4"), will fail if /d/d1/d2/d3 does not exist already.|
|SYS_SYNC||Sync(void)||0||-1||None||Forces any outstanding operations to be written to disk.|
|SYS_FORMAT||Format(char *dev, char *fstype)||0||-1||
||You only have to implement the format of the GOSFS filesystem.|
You may also have to make a few small changes if you do not implement on top of project 2. These were listed in the project 3 description. In short, you will have to make Sys_RegDeliver, Sys_Signal, and Check_Pending_Signal return 0, and you will have to provide a minimal implementation of Get_Next_Runnable in kthread.c; you can replace the TODO with
best = Find_Best(&s_runQueue); KASSERT(best != 0); Remove_Thread(&s_runQueue, best);For details on merging your sources, please refer to the "Final Notes" section in Project 3 right at the end.
Here are some must do's for the project: