Hadoop File System Internals

Webinar

Author(s)/Presenter(s):

Dhruba Borthakur

Library Content Type

Presentation

Library Release Date

Focus Areas

Abstract

This talk describes the general design of the Hadoop Distributed File System (HDFS) and some recent developments in that design. It covers the three typical use cases of HDFS. As HDFS continues to move from batch workloads toward more real-time workloads, the talk describes the changes made to HDFS to remove single points of failure. It also describes the scale and growth of some of the largest HDFS clusters currently being deployed.

Learning Objectives

Cloud storage
Challenges with High Availability
Large scale fault tolerance