MarFS: Near-POSIX Access to Object-Storage

webinar

Author(s)/Presenter(s):

Jeff Inman

Gary Grider

Library Content Type

Presentation

Library Release Date

Focus Areas

Cloud Storage Technologies

Data Governance & Security

Abstract

Many computing sites need long-term retention of mostly cold data, often referred to as “data lakes”. The main function of this storage tier is capacity, but non-trivial bandwidth and access requirements may also exist. For many years, tape was the most economical solution. However, data sets have grown faster than tape bandwidth has improved, so disk is now becoming more economically feasible for this storage tier. MarFS is a near-POSIX file system that stores metadata across multiple POSIX file systems for scalable parallel access, and stores data in industry-standard, erasure-protected, cloud-style object stores for space-efficient reliability and massive parallelism. Our presentation will include: cost modeling of disk versus tape for campaign storage; challenges of presenting object storage through POSIX file semantics; scalable parallel metadata operations and bandwidth; scaling metadata structures to handle trillions of files with billions of files per directory; alternative technologies for data and metadata; and the structure of the MarFS solution.