Abstract
This presentation discusses the use of distributed relational database techniques to enable exascale file system and hierarchical storage management (HSM) metadata management for large-scale Linux clusters. We present metadata management techniques for scaling a POSIX namespace to a trillion files, tracking file storage allocation information, and maintaining file content knowledge across a Linux cluster. We address major data management challenges, focusing on the critical needs for scalability in exascale-era computing and file storage. We present test results from a Linux cluster running High Performance Storage System (HPSS) and DB2 software to show architectural approaches that surpass the file storage limitations of a single system in a cost-effective and efficient manner.
Learning Objectives
Managing a distributed POSIX namespace
Balancing metadata, storage, and workload across storage servers
Maintaining file content knowledge using a flexible, scalable XML architecture