Distributed File Systems

Draft for the term seminar by Henrique Andrade - Oct/97

  1. Introduction

    Distributed file systems are a broad research area. Papers date from the early 1980s [1], and a great deal of academic research and many commercial proposals followed in the mid-1980s [2,3].

    More recent research addresses the need for improved availability and for integration with, or adaptation to, the Web [4,5]. Both lines of work assume that distributed file systems will span a broader range of servers, not only the corporate servers of a single institution.

    This document summarizes four articles that together give an overview of the issues related to distributed file systems, and it also includes a survey of references.

  2. An Example of a Distributed File System

The white paper [8] presents the Open Software Foundation’s Distributed File System (OSF DFS).

Paper Summary

The main motivation for using a distributed file system is the ease of information sharing that comes from being able to export file systems. The main features such a system must provide are:

  1. Data consistency: distributed file systems allow a user on one networked computer to access and modify data stored in files on another computer. A mechanism is therefore needed to ensure that each user sees the changes that others make to their copies of the data. DFS uses a set of tokens to keep track of cached information.
  2. Uniform access: a distributed computing environment should support global file names. One mechanism that allows the name of a file to look the same on all computers is called a uniform name space. DFS specifies a naming convention that all installations are required to respect. Another approach to this problem uses directory assistance systems.
  3. Security: distributed file systems must provide authentication. Furthermore, once users are authenticated, the system must ensure that the operations they perform are permitted on the resources they access. This process is called authorization. DFS uses Kerberos for authentication and Access Control Lists (ACLs) for authorization.
  4. Reliability: a distributed file system improves reliability through its distributed nature, that is, by eliminating the single point of failure of non-distributed systems. DFS uses file replication to achieve this goal, i.e., multiple copies of files on multiple servers.
  5. Availability: a distributed file system must allow system administrators to perform routine maintenance while the file server is in operation, without disrupting users’ work. DFS achieves this through the uniform name space (which allows files to be moved across the file system during administrative tasks) and transaction logging (which allows the system to perform error-recovery procedures).
  6. Performance: the network is considerably slower than internal buses. Therefore, the less often clients have to access servers, the better the performance. DFS uses a cache (of both file status and file data) to lower the network load.
  7. Manageability: systems should provide a way of keeping track of configuration information (e.g., the location of files). DFS uses distributed databases for this task.
  8. Standards conformance: DFS complies with the IEEE POSIX 1003.1 file system semantics standard.
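
To make the data consistency item more concrete, the following Python sketch shows one way a server could use tokens to keep track of cached data: any number of clients may hold read tokens on a file, but granting a write token revokes the tokens held by other clients. It is only a sketch under stated assumptions. The class `TokenManager`, its methods, and the file path are hypothetical names chosen for illustration, and the real DFS token protocol is considerably richer than this.

```python
from collections import defaultdict


class TokenManager:
    """Hypothetical server-side token table: many readers or one writer per file."""

    def __init__(self):
        # path -> {client: "read" | "write"}
        self.tokens = defaultdict(dict)
        # Log of (client, path) revocations, standing in for revoke messages.
        self.revoked = []

    def acquire(self, client, path, mode):
        """Grant `client` a token on `path`, revoking conflicting tokens first."""
        if mode not in ("read", "write"):
            raise ValueError("mode must be 'read' or 'write'")
        holders = self.tokens[path]
        for other, held in list(holders.items()):
            # A write request conflicts with every token held by another client;
            # a read request conflicts only with another client's write token.
            if other != client and (mode == "write" or held == "write"):
                del holders[other]
                self.revoked.append((other, path))
        holders[client] = mode

    def holders(self, path):
        """Return a copy of the current token holders for `path`."""
        return dict(self.tokens[path])


tm = TokenManager()
tm.acquire("A", "/fs/report.txt", "read")
tm.acquire("B", "/fs/report.txt", "read")   # two readers can coexist
tm.acquire("A", "/fs/report.txt", "write")  # B's read token is revoked
print(tm.holders("/fs/report.txt"))         # {'A': 'write'}
print(tm.revoked)                           # [('B', '/fs/report.txt')]
```

A client whose token has been revoked would have to discard its cached copy and ask the server again, which is how the other users come to see the change.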

 

  3. Performance Study of a Running Distributed File System

    Studying a running distributed file system gives insight into the real issues the system faces and shows whether it supports its workload as predicted. It can also suggest new improvements. The paper [9] performs such a study on the Andrew File System (AFS).

    Paper Summary

    The key motivation for this study is the growth of installed distributed file systems. This growth brings problems: the level of trust between users is lowered, failures tend to be more frequent, administrative coordination is more difficult, and performance degrades.

    The AFS Design

    AFS was designed in 1983 at CMU with the goal of serving the campus community. It was intended primarily for a LAN, but its underlying RPC mechanism is well suited to both LANs and WANs, and so is the file system.

    AFS presents a location-transparent Unix file name space to clients, using a set of trusted servers. Files and directories are cached on the local disks of clients using a consistency mechanism based on callbacks. Directories are cached in their entirety, while files are cached in 64 KB chunks. All updates to a file are propagated to its server upon close. Directory modifications are propagated immediately.
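
The callback mechanism and the write-on-close behavior described above can be sketched in Python as follows. This is a simplified illustration, not the AFS implementation: the classes `Server` and `Client` and all their methods are hypothetical, whole files are cached instead of 64 KB chunks, and directories, disks, and the network are not modeled.

```python
class Server:
    """Hypothetical file server that remembers which clients cache each file."""

    def __init__(self):
        self.files = {}      # path -> contents
        self.callbacks = {}  # path -> set of clients holding a callback promise

    def fetch(self, client, path):
        # Hand out the data together with a callback promise.
        self.callbacks.setdefault(path, set()).add(client)
        return self.files.get(path, "")

    def store(self, writer, path, data):
        self.files[path] = data
        # Break the callbacks of every other client that caches this file.
        for client in self.callbacks.get(path, set()) - {writer}:
            client.break_callback(path)
        self.callbacks[path] = {writer}


class Client:
    def __init__(self, server):
        self.server = server
        self.cache = {}     # path -> locally cached contents
        self.dirty = set()  # paths modified locally since the last close
        self.fetches = 0    # how many times the server was asked for data

    def open(self, path):
        if path not in self.cache:  # no valid cached copy, so fetch one
            self.cache[path] = self.server.fetch(self, path)
            self.fetches += 1
        return self.cache[path]

    def write(self, path, data):
        self.cache[path] = data     # updates stay local until close
        self.dirty.add(path)

    def close(self, path):
        if path in self.dirty:      # propagate updates to the server on close
            self.server.store(self, path, self.cache[path])
            self.dirty.discard(path)

    def break_callback(self, path):
        self.cache.pop(path, None)  # the cached copy is no longer valid


server = Server()
server.files["/afs/notes"] = "v1"
a, b = Client(server), Client(server)
a.open("/afs/notes")
b.open("/afs/notes")
b.open("/afs/notes")                     # served from b's cache, no server contact
a.write("/afs/notes", "v2")
a.close("/afs/notes")                    # store on close breaks b's callback
print(b.open("/afs/notes"), b.fetches)   # v2 2
```

The point of the design is that a client with a valid callback never has to ask the server whether its cached copy is still current; the server takes on the job of notifying it.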

    Backup, disk quota enforcement, and most other administrative operations in AFS operate on volumes. A volume is a set of files and directories located on one server and forming a partial subtree of the shared name space, organized on either a per-user or a per-project basis.

    AFS uses ACLs and the granularity of protection is an entire directory.
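
Directory-level granularity means that an access check looks at the ACL of the directory containing a file, not at the file itself. The Python sketch below illustrates the idea under stated assumptions: the `acls` table, the user names, the right names `read` and `write`, and the function `is_allowed` are all hypothetical and do not reproduce the actual AFS rights or interfaces.

```python
import posixpath

# Hypothetical ACL table: directory -> {user: set of rights}.
acls = {
    "/afs/proj": {"alice": {"read", "write"}, "bob": {"read"}},
}


def is_allowed(user, path, right):
    """Check `right` against the ACL of the directory that contains `path`."""
    directory = posixpath.dirname(path)
    return right in acls.get(directory, {}).get(user, set())


print(is_allowed("alice", "/afs/proj/a.txt", "write"))  # True
print(is_allowed("bob", "/afs/proj/a.txt", "write"))    # False
print(is_allowed("bob", "/afs/proj/b.txt", "read"))     # True
```

Because the ACL belongs to the directory, every file in `/afs/proj` gets the same answer for a given user and right.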

    AFS supports multiple administrative cells, each with its own servers, clients, system administrators, and users. Each cell is a completely autonomous environment, but a federation of cells can cooperate in an integrated way.

    The Evaluation

    The goal of the evaluation is to understand empirically how effective the various AFS mechanisms are in a large-scale, wide-area context. The paper presents evaluations of the AFS data profile, of performance and reliability, and of sharing patterns.

    The results suggest that AFS is effective in the environment studied.

    Future role of AFS

    As demonstrated by the experiments, AFS can deal with large distributed file systems. Compared with the data sharing mechanisms on the Internet (e.g., Archie, Gopher, WWW), AFS addresses many issues that those systems do not: caching, replication, location transparency, and authentication and protection mechanisms. This would allow such a system to be used as the basis for building a more scalable service on the Internet.

  4. Distributed File Systems and the Web

The success of the Web has exposed some limitations of the underlying Web technology: its tendency to overload the network and servers, its limited ability to control access to sensitive data, its lack of mechanisms for data consistency, and its susceptibility to network and server failures.

The paper [5] shows that the Web and AFS are not really competing technologies. Rather, they are complementary technologies that may be used together for mutual advantage.

Paper Summary

Comparisons of the characteristics and evolution of these technologies show their complementary aspects:

Symbiosis

Because the Web and AFS address different characteristics in complementary ways, combining their strengths can lead to a better system. Two examples show how this can be done:

  5. New Trends in the Development of Distributed File Systems

    In addition to the integration of distributed file system technology with the Web, the growth of current systems raises new issues that require some redesign of earlier systems.

    The paper [4] discusses these problems and presents the current efforts to solve them.

    Paper Summary

    As dependence on distributed file systems increases, availability becomes a serious concern. For example, AFS can cache entire files on local disks, which allows read accesses to avoid server interaction. Writes, however, still depend on the server. AFS is therefore sensitive to server and network failures, even though the dependence is minimal.

    The Coda File System [7] tries to solve these problems through two mechanisms: replication, and inconsistency detection combined with exception handling. The underlying view is that inconsistency is tolerable in a distributed file system provided that it is rare and detectable, and provided that there are mechanisms to help users resolve inconsistencies.
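
One well-known way to make inconsistency between replicas detectable is to keep a version vector with each copy of a file, counting the updates each replica has seen. The Python sketch below is an illustration of that general technique under stated assumptions, not a description of the mechanisms in [4] or [7]: the function `compare`, the replica names `s1` and `s2`, and the returned labels are hypothetical.

```python
def compare(a, b):
    """Compare two version vectors (dicts mapping replica name -> update count)."""
    replicas = set(a) | set(b)
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in replicas)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "conflict"  # concurrent updates: must be resolved, possibly by the user
    if a_ahead:
        return "a newer"   # b is simply stale and can be brought up to date
    if b_ahead:
        return "b newer"
    return "equal"


print(compare({"s1": 2, "s2": 1}, {"s1": 1, "s2": 1}))  # a newer
print(compare({"s1": 2, "s2": 1}, {"s1": 1, "s2": 2}))  # conflict
print(compare({"s1": 1, "s2": 1}, {"s1": 1, "s2": 1}))  # equal
```

Only the `conflict` case needs exception handling; when one vector dominates the other, the stale replica can be updated without involving the user.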

  6. References

[1] Walker, B., Popek, G., English, R., Kline, C., Thiel, G. The Locus Distributed Operating System. Proceedings of the Ninth ACM Symposium on Operating Systems Principles, Bretton Woods. Oct 1983.

[2] Sandberg, R., Goldberg, D., Kleiman, S., Walsh, D., Lyon, B. Design and Implementation of the Sun Network Filesystem. Summer Usenix Conference Proceedings. 1985.

[3] Satyanarayanan, M., Howard, J. H., Nichols, D. A., Sidebotham, R. N., Spector, A. Z., West, M. J. The ITC Distributed File System: Principles and Design. Proceedings of the Tenth ACM Symposium on Operating System Principles. Dec 1985.

[4] Satyanarayanan, M. Autonomy or Independence in Distributed Systems? Position Paper. Third ACM SIGOPS European Workshop. Sept. 1988, Cambridge, England.

[5] Satyanarayanan, M., Spasojevic, M. AFS and the Web: Competitors or Collaborators? Proceedings of the Seventh ACM SIGOPS European Workshop. Sept 1996, Connemara, Ireland.

[6] Howard, J. H., Kazar, M. L., Menees, S. G., Nichols, D. A., Satyanarayanan, M., Sidebotham, R. N., West, M. J. Scale and Performance in a Distributed File System. ACM Transactions on Computer Systems, Vol 6, No 1, Feb 1988.

[7] Kistler, J., Satyanarayanan, M. Disconnected Operation in the Coda File System. ACM Transactions on Computer Systems, Vol 10, No 1, Feb 1992.

[8] Open Software Foundation, Inc. File Systems in a Distributed Computing Environment. White Paper. 1991.

[9] Spasojevic, M., Satyanarayanan, M. An Empirical Study of a Wide-Area Distributed File System. Technical Report. Transarc Corporation. Nov 1994.