SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA

Scalable Metadata in Distributed File Systems: Revisiting the GoogleFS Design for Exabyte-Scale Namespaces

Abstract

GoogleFS introduced the architectural separation of metadata and data, but its reliance on a single active master imposed fundamental limitations on scalability, redundancy, and availability. This talk presents a modern metadata architecture, exemplified by SaunaFS, that eliminates the single-leader model by distributing metadata across multiple concurrent, multi-threaded servers. Metadata is stored in a sharded, ACID-compliant transactional database (e.g., FoundationDB), enabling horizontal scalability, fault tolerance through redundant metadata replicas, reduced memory footprint, and consistent performance under load. The result is a distributed file system architecture capable of exabyte-scale operation in a single namespace while preserving POSIX semantics and supporting workloads with billions of small files.
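As a rough illustration of why a transactional store lets several metadata servers work concurrently without a single leader, the sketch below maps a namespace onto keys and performs a file create as one atomic transaction. It is a minimal in-memory model under stated assumptions, not SaunaFS code and not the FoundationDB API: the names TinyKV, Txn, ConflictError, create_file, and the key layout ("dirent", parent, name) and ("inode", n) are all hypothetical and chosen only for this example.

```python
import threading

ROOT = 1  # hypothetical inode number of the root directory


class ConflictError(Exception):
    """Raised at commit when a key this transaction read has since changed."""


class TinyKV:
    """In-memory stand-in for a transactional key-value store (assumption)."""

    def __init__(self):
        self._data = {}  # key -> (version, value)
        self._lock = threading.Lock()

    def begin(self):
        return Txn(self)


class Txn:
    """Optimistic transaction: buffer writes, validate reads at commit."""

    def __init__(self, store):
        self._store = store
        self._reads = {}   # key -> version observed on first read
        self._writes = {}  # key -> value to write at commit

    def get(self, key):
        if key in self._writes:
            return self._writes[key]
        with self._store._lock:
            version, value = self._store._data.get(key, (0, None))
        self._reads.setdefault(key, version)
        return value

    def set(self, key, value):
        self._writes[key] = value

    def commit(self):
        with self._store._lock:
            # Abort if any key we read was modified by another transaction.
            for key, seen in self._reads.items():
                current, _ = self._store._data.get(key, (0, None))
                if current != seen:
                    raise ConflictError(key)
            # Apply all buffered writes atomically, bumping versions.
            for key, value in self._writes.items():
                version, _ = self._store._data.get(key, (0, None))
                self._store._data[key] = (version + 1, value)


def create_file(kv, parent, name, retries=5):
    """Create a file entry; any server holding `kv` could run this."""
    for _ in range(retries):
        txn = kv.begin()
        try:
            if txn.get(("dirent", parent, name)) is not None:
                raise FileExistsError(name)
            # A single counter key is a hotspot; used here only for brevity.
            inode = (txn.get(("next_inode",)) or ROOT) + 1
            txn.set(("next_inode",), inode)
            txn.set(("dirent", parent, name), inode)
            txn.set(("inode", inode), {"type": "file", "size": 0})
            txn.commit()
            return inode
        except ConflictError:
            continue  # another writer won the race; retry from scratch
    raise RuntimeError("too many conflicts")
```

Because the directory entry, the inode record, and the counter are written in one transaction, two servers racing to create the same name cannot both succeed: the loser sees a conflict at commit, retries, and then observes the existing entry. In a real deployment the store would be sharded and replicated, which this single-process model does not attempt to show.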

Learning Objectives

Understand the limitations of single-leader metadata architectures in distributed file systems, including bottlenecks in scalability, availability, and fault tolerance.
Learn how to design a distributed metadata service using concurrent, multi-threaded servers to eliminate the need for a central coordinator.
Explore the role of sharded, ACID-compliant transactional databases (e.g., FoundationDB) in enabling scalable, consistent, and highly available metadata storage.
Discover architectural patterns that support exabyte-scale namespaces with billions of files while preserving POSIX semantics.
Understand the challenges and architectural transitions in moving from a single metadata server (MDS) leader to a fully distributed metadata service.