HDDs have been the traditional hardware infrastructure for object stores such as S3, Google Cloud Storage, and Azure Blob in data lakes. But as AI deployments transition to production scale in organizations (Meta's Tectonic-Shift platform being a good example), they impose demands on the data storage and ingestion pipeline that have not been seen before. Using Deep Learning Recommendation Model (DLRM) training as an AI use case, we first introduce the challenges object stores can expect to face as AI deployments scale: the growth in the scale of available data, the advent of ever-faster training GPUs, and the growth of AI/ML ops deployments. We then explain how flash storage is well positioned to meet the bandwidth and power needs of these systems. We will share key observations from storage trace analysis of several MLPerf DLRM preprocessing and training captures, and conclude with a call to action for more work on standardizing benchmarks that characterize data ingestion performance and power efficiency.
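As a rough illustration of the bandwidth-and-power argument (a sketch, not material from the talk), the short Python snippet below estimates how many drives, and how much power, it takes to sustain an assumed ingest bandwidth. All device figures are illustrative ballpark assumptions, not measured data.

# Back-of-envelope sketch: drives and power needed to sustain an assumed
# DLRM training-ingest bandwidth. All device figures below are illustrative
# assumptions, not measurements.
import math

INGEST_GBPS = 20.0  # assumed aggregate ingest bandwidth for a training job (GB/s)

# (name, sequential read bandwidth in GB/s, active power in watts) -- assumed
devices = [
    ("HDD, 7200 rpm", 0.25, 8.0),
    ("NVMe SSD, PCIe Gen4", 7.0, 12.0),
]

for name, bw, watts in devices:
    n = math.ceil(INGEST_GBPS / bw)  # drives needed for bandwidth alone
    print(f"{name}: {n} drives, ~{n * watts:.0f} W, "
          f"{bw / watts * 1000:.0f} MB/s per watt")

Under these assumed figures the flash configuration delivers more than an order of magnitude better bandwidth per watt, which is the shape of the argument the talk develops.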

Presentation Type
Presentation
Learning Objectives

Understand the role of data ingestion in the AI pipeline
Describe how AI deployment at scale has changed and what is expected of object stores
Understand how flash storage can help address this problem and what more is needed

YouTube Video ID
wgHAhbXuVKc
Room Location
Cypress