Abstract
Faster storage media, faster interconnection networks, and improvements in systems software have significantly mitigated the effect of I/O bottlenecks in HPC applications. Even so, applications that access a large number of small files are limited by the ability of the underlying filesystem to handle such workloads efficiently.
We believe that an important factor preventing existing HPC filesystems from scaling out is the continued use of a single, globally consistent filesystem namespace to serve all applications running in a single computing environment. A shared filesystem namespace accessible from anywhere in a computing environment has many benefits, but it increases the communication between each application process and the filesystem's metadata servers that is needed to order concurrent metadata changes.
At exascale and beyond, synchronization of anything global should be avoided. This talk re-imagines the filesystem as transient service code running inside the application it serves. Each application has its own namespace, and the resources and components that namespace uses are scaled and customized to the task at hand. We discuss two filesystem scenarios: a scalable namespace service for fast access to files, and in-situ output indexing to accelerate analysis operations.