CMSC 412 Project #5

File System

Due Friday, Dec 3rd, 2004 (6:00 PM), no late submission accepted

Submissions received before Wednesday, Dec 1st, 2004 (6:00 PM) will get 10% extra credit

  • Grading Criteria
  • Submission Instructions
  • Slides used in recitation
  • New Project 5 guideline

    New project files

  • project5-cyclone.tgz - new project distribution
  • Updated syscall.c
  • Updated vfs.h, vfs.c, pfat.c, gosfs.c
  • New Updated p5test.c

    Introduction

    The purpose of this project is to add a new filesystem to GeekOS, as well as the standard operations for file management.

    A working project 3 or project 4 is not required for this project. If you do not build this project on top of project 4, you must make the following changes:

  • Remove Init_VM from main.c
  • Remove Init_Paging from main.c
  • Modify the USER_IMP_C := uservm.c line to USER_IMP_C := userseg.c
  • GOSFS - GeekOS FileSystem

    The main part of this project is to develop a new filesystem for the GeekOS. This filesystem will reside on the second IDE disk drive in the Bochs emulator. This will allow you to continue to use your existing PFAT drive to load user programs while you test your filesystem. The second IDE disk's image is called diskd.img. It is 10MB by default.

    GOSFS will provide a filesystem that includes multiple directories and long file name support.

    VFS and file operations

    Since GEEKOS will have two types of filesystems (PFAT and GOSFS), it will have a virtual filesystem layer (VFS) to handle sending requests to an appropriate filesystem (see figure below). We have provided an implementation of the VFS layer in the file vfs.c. The VFS layer will call the appropriate GOSFS routines when a file operation refers a file in the GOSFS filesystem.

    The implementation of PFAT is in pfat.c. You will implement GOSFS in gosfs.c and relevant system calls in syscall.c. VFS picks the functions to call based on supplied structures containing function pointers, one for each operation. For example, see Init_PFAT in pfat.c: this initializes the PFAT filesystem with VFS by passing it a pointer to s_pfatFilesystemOps, a structure that contains function pointers to PFAT routines for mounting (and formatting) a filesystem.  Other PFAT functions are stored in different structures (e.g., look at the PFAT_Open routine, which passes the s_pfatFileOps structure to VFS).  You will analogously use s_gosfsFilesystemOps, s_gosfsMountPointOps, s_gosfsDirOps, and s_gosfsFileOps in gosfs.c. You should also add a call to Init_GOSFS provided gosfs.c in main.c to register the GOSFS filesystem.  In general, use the PFAT implementation as your guide to interfacing GOSFS with VFS.

    Open files are tracked in memory with a file descriptor, defined as struct File.  Each user space process will have a file descriptor table that records which files the process can currently read and write. A user process can have up to USER_MAX_FILES files open at once. The file descriptor table is implemented asa struct File *fileList[USER_MAX_FILES] array that you must add to struct User_Context. Note that not all the entries in the fileList are necessarily open files, since usually a process has less than USER_MAX_FILES files open at once. If fileList[i] is NULL, it represents a free slot (file descriptor is not used). This descriptor will be filled out by the code in VFS; e.g., see Open in vfs.h, whose pFile argument is a pointer to a free slot in the table.

    Your filesystem should support long filenames (at most 128 bytes, which includes a null at the end). A file's full path (which includes directories and subdirectories) will be no more than 1024 characters in total.

    You should keep track of free disk blocks using a bit vector (as described in the textbook, Section 12.5.1, page 431). A library called bitset is provided (see bitset.h and bitset.c) that manages a set of bits and provides functions to find bits that are 0 (i.e. correspond to free disk blocks).

    The block size for GOSFS is 4 KB; IDE sectors (a.k.a. disk blocks) are 512 bytes, so a block is 8 sectors.  Thus one bit in a bitset corresponds to a 4KB block. For example, a bitset that is 8192 bits (1024 bytes) will keep track of 8192 * 4KB = 32 MB of data.

    Directory Structure

    See the recitation slides for details on directory structure. Each directory in GOSFS takes up a single disk block. The structure of the directory is defined in gosfs.h. A directory is an array of struct GOSF_Dir_Entry of size GOSFS_DIR_ENTRIES_PER_BLOCK. Each filenode (directory entry) can represent either a file or a subdirectory.  That is, unlike UNIX where directory entries and inodes are separate, here all file information is tracked in its directory entry.

    The filenode for a directory is represented by the GOSFS_DIRENTRY_ISDIRECTORY flag set in the GOSFS_Dir_Entry->flags field. The location of the block that holds the data for the directory will be stored in the first entry in the blocksList array of the directory's filenode (hence entries blocksList[1]..blocksList[GOSFS_NUM_BLOCK_PTRS-1] are unused).

    Files

    Unlike directories, that have a fixed size (one block, irrespective of how many files they hold), files can take up an arbitrary number of disk blocks. You will use a version of indexed allocation to represent the data blocks of your filesystem. The blocksList field in the GOSF_Dir_Entry struct in the gosfs.h file keeps track of data blocks for a file. The first eight KB-blocks are direct blocks, the ninth points to a single indirect block, the tenth to a double indirect block. See textbook, Section 12.4.3 on p. 429 (and the recitation slides) for a detailed layout (though there is no triple indirect block in GOSFS as there is in the text example).

    Mounting

    The Mount system call allows you to associate a filesystem with a place in the file name hierarchy. The Mount call is implemented as part of the VFS code we supply (see Mount function in vfs.c); you will implement your Init_GOSFS function so that VFS's mount code will call your function GOSFS_Mount() in gosfs.c. Among other things, it must ensure that the filesystem being mounted is GOSFS. See new syscalls below for details.

    New System Calls

    You have to implement the sematics of the new system calls as described below. As you can see, the semantics is very similar to the UNIX file system.

  • NOTE: All user-supplied pointers (e.g. strings, buffers) must be checked for validity.
  • NOTE 2: Some of the checks in 'Reason for failure' column are already done by vfs.c so make you don't have to perform the check if it's implemented for you already.
  • Call User Function Return on success Return on failure Reasons for failure Comment
    SYS_MOUNT Mount(char *dev, char *prefix, char *fstype) 0 -1
  • a filesystem already mounted under name
  • illegal value for one of the parameters
  •  
    SYS_OPEN Open(char *path, int mode) new file descriptor number -1
  • name does not exist (if permissions don't include O_CREATE )
  • path to name does not exist (if permissions include O_CREATE )
  • attempt to open a directory, (should use OpenDirectory)
  • there's no create syscall, so setting O_CREATE will create the file. If the file exists, the call succeeds (return >= 0) but its data contents is not affected.
  • Should NOT create directories; e.g. if you do Open("/d/d1/d2/d3/xFile", O_CREATE) and the leading path /d/d1/d2/d3 does not exist, the syscall fails, returning -1
  • You should use the Allocate_File in vfs.c to get a File when opening.
  • The permissions values are flags and may be or'ed together in a call. For example:
    • O_CREATE | O_READ
    • O_READ | O_WRITE
    • O_CREATE | O_READ | O_WRITE
  • SYS_OPENDIRECTORY Open_Directory(char *path) 0 -1
  • path does not exist
  •  
  • Should NOT create directories, use CreateDirectory instead.
  • You should use the Allocate_File in vfs.c to get a File when opening.
  • SYS_CLOSE Close(int fd) 0 -1
  • fd not within range 0-(USER_MAX_FILES-1)
  • fd is not an open file
  •  
    SYS_DELETE Delete(char *path) 0 -1
  • name does not exist
  • name is a non-empty directory
  • if Delete(file) is called and file is still open in other threads or even in the thread that called Delete(), all the subsequent operations on that file (except Close()) should fail
    SYS_READ Read(int fd, char *buffer, int length) number of bytes read -1
  • fd not in range
  • fd is not an open file
  • fd was not open with O_READ flag
  • fd is a directory
  • it's OK if return value < length, for instance reading close to end of file
  • increase the filePos (the current position in the file) by the number of bytes read (if successful)
  • SYS_READENTRY Read_Entry(int fd, VFS_Dir_Entry *entry) 0 -1
  • fd not in range
  • fd is not an open file
  • fd was not open with O_READ flag
  • fd is not a directory
  • this reads a file out of a directory entry.
  • increase the filePos, if successful
  • VFS_Dir_Entry is defined in fileio.h
  • SYS_WRITE Write(int fd, char *buffer, int length) number of bytes written -1
  • fd not in range
  • fd is not an open file
  • fd was not open with O_WRITE flag
  • fd is a directory
  • increases filePos by the number of bytes written, if successful
  • "Grow on write"- allocate blocks "on the fly" if past end of file
  • "Grow minimum" - allocate only the needed blocks. A seek to a position past the end of the file should be allowed and a write will allocate only a single block rather than all blocks in between.
  • SYS_STAT Stat(char *path, VFS_File_Stat *stat) 0 -1
  • path is not valid
  • VFS_File_Stat is defined in fileio.h
  • SYS_FSTAT FStat(int fd, VFS_File_Stat *stat) 0 -1
  • fd not within 0-9
  • fd is not an open file
  • VFS_File_Stat is defined in fileio.h
  • SYS_SEEK Seek(int fd, int offset) 0 -1
  • fd not within 0-9
  • fd is not an open file
  • offset is an absolute position; could be equal to fileSize or greater, then Write appends, see above
    SYS_CREATEDIR Create_Directory(char *path) 0 -1
  • name already exists, as file or directory
  • regular file encountered on the path to name
  • Should NOT create directories recursively; e.g. CreateDirectory("/d/d1/d2/d3/d4"), will fail if /d/d1/d2/d3 does not exist already.
    SYS_SYNC Sync(void) 0 -1 None Forces any outstanding operations to be written to disk.
    SYS_FORMAT Format(char *dev, char *fstype) 0 -1
  • illegal value for dev
  • dev is in use, i.e. mounted
  • You only have to implement the format of the GOSFS filesystem.

    Disk Layout

    disk layout

    A guideline is provided above. The first block (0) is called the SUPERBLOCK, and it contains filesystem housekeeping data. Blocks >= 1 contain files and directories.

  • The Magic number at the very beginning could be 0xDEADBEEF, "GOSF" or the like. This tells you that the disk has a GOSFS filesystem on it. If you try to mount a drive and you don't find the magic signature, return error.
  • Root Dir Pointer holds the block number of the block containing the root directory.
  • Size is the size of the disk, in 4KB blocks. (32M / 4K = 8K for the example above)
  • Free Blocks Bitmap is : Size bits large, that is Size/8 bytes large. (8K / 8 = 1K for the example above). Every block has an associated bit.

    When you do a Format() , you make a raw disk usable with GOSFS. That is:

    1. Get drive's size, convert it in # of blocks. Get_Num_Blocks() tells you that.
    2. Figure out Free Blocks Bitmap size, mark them all free.
    3. Create a valid, but empty directory. That will be the root directory. Make Root Dir Pointer point to it.
    4. Mark superblock and block for root directory as used in the Free Blocks Bitmap
    5. If everything went OK, write the Magic. Now the disk is ready to be mounted and used.
    Keep in mind that the superblock and root directory have no associated GOSFS_Dir_Entry.

    How to create an arbitrarily big diskd.img

  • Change the size/disk geometry by changing the diskd line in .bochsrc
  • Change the argument to $(ZEROFILE) in Makefile. For $(ZEROFILE) one block is 512, so 4096 blocks = 2 MB, 65536 blocks = 32 MB and so on.

    Notes

    You do not need to consider situations where two processes have the same file open. You do not need to consider situations where one process opens the same file twice without closing it in between.

    Requirements

  • Make sure your Mount() works well, so that we can test your project. If we cannot Mount() a GOSFS, we cannot grade your project.
  • You might also want to mount /d to ide1 automatically in main() to speed up your testing, but the code you submit should not mount /d automatically.
  • You should support disk sizes of at least 32 MB. More than 32 MB is optional. Following the procedure described in the "How to create an arbitrary size big diskd.img" section above, in your submitted project, when someone types gmake, a 32 MB file should be created.
  • You should support file sizes of at least 5 MB (double indirect threshold crossed, yes). More than 5 MB is optional.

    Testing

    In src/user there are some programs that can be used to test your file management syscalls: cp.c, format.c, ls.c, mkdir.c, mount.c, p5test.c.