Abstract
PFFS is a POSIX-compliant parallel file system designed for high resiliency and scalability. User data is dispersed across the cluster with no replication, which provides significant storage savings. The resiliency level is selected by the user. Peer failures do not disrupt applications: the cluster automatically performs on-the-fly repairs as required for read and write operations to complete successfully, so applications can still read and write data held on failed peers. Peers communicate using two main protocols: the CLI protocol for namespace commands (e.g. link, unlink, symlink, mkdir, rmdir) and the MBP protocol for file I/O. Both protocols are highly efficient and produce very little chatter; they rely on multicast and inference to preserve scalability as the peer count grows large. The software is heavily multithreaded to parallelize network I/O, disk I/O, and computation. Gateways provide access to the cluster by exporting VFS semantics and are accessible over both NFS and CIFS. Gateways hold no persistent user data; the peers are the only persistent repository. Configuration and administration of the cluster (both gateways and peers) is very simple and consists of a small text file of a few lines. Healing the cluster is highly efficient because healing walks the file system and therefore processes only occupied data blocks.
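To make the idea of dispersal without replication and on-the-fly repair concrete, here is a minimal Python sketch. It is not the PFFS algorithm: the abstract does not say which code PFFS uses or how the user-selected resiliency level maps to it, so the sketch assumes the simplest possible scheme, k data fragments plus one XOR parity fragment, which tolerates a single failed peer. The function names disperse and read_with_repair are hypothetical.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length fragments."""
    return bytes(x ^ y for x, y in zip(a, b))


def disperse(data: bytes, k: int) -> list:
    """Split data into k equal fragments plus one XOR parity fragment.

    Assumption: one fragment per peer, single parity (illustrative only).
    """
    frag_len = -(-len(data) // k)              # ceiling division
    padded = data.ljust(frag_len * k, b"\0")   # pad so fragments are equal length
    frags = [padded[i * frag_len:(i + 1) * frag_len] for i in range(k)]
    return frags + [reduce(xor_bytes, frags)]


def read_with_repair(fragments: list, length: int) -> bytes:
    """Reassemble the data; a failed peer is represented by None.

    A single missing fragment is rebuilt during the read by XORing the survivors.
    """
    missing = [i for i, f in enumerate(fragments) if f is None]
    if len(missing) > 1:
        raise ValueError("single parity tolerates only one failed peer")
    if missing:
        survivors = [f for f in fragments if f is not None]
        fragments = list(fragments)
        fragments[missing[0]] = reduce(xor_bytes, survivors)
    return b"".join(fragments[:-1])[:length]   # drop parity and padding


data = b"parallel file system"
pieces = disperse(data, 4)
pieces[2] = None                               # simulate a failed peer
assert read_with_repair(pieces, len(data)) == data
```

With k = 4 this toy scheme stores 5 fragments for every 4 fragments of data, an overhead of 1.25x, compared with 2x for keeping one full replica. A production system offering a selectable resiliency level would need a code that tolerates more than one failure.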
Learning Objectives
Resiliency design considerations for large clusters
The efficient use of multicast for scalability
Large clusters must administer themselves
Fault injection when failures are the nominal condition
Next step: 64K peers