
Enabling AI Storage Benchmark Evolution at the Pace of AI

Grand Mesa F

Wed Apr 29 | 3:30pm

Abstract

The rapid acceleration of AI model complexity, scale, and diversity has exposed a widening gap between real-world AI storage workloads and the benchmarks traditionally used to evaluate storage systems. As AI evolves from large-scale training to retrieval-augmented generation, vector databases, and KV-cache-intensive inference, storage systems face new I/O patterns, new bottlenecks, and new expectations for determinism, bandwidth, and latency. Accurately representing these behaviors in a benchmark is now as challenging as architecting the storage itself.
This session explores the emerging landscape of AI-focused storage benchmarking, with emphasis on two major industry efforts: MLPerf Storage v3.0, the next iteration of MLCommons’ benchmark suite, and the newly forming SNIA AI Data Workloads Technical Working Group (TWG). Together, these groups aim to close the gap between how AI workloads behave in production and how storage devices are evaluated in labs.
We will outline the core challenges in representing AI behavior at the storage layer, identify the specific pain points storage vendors and system architects face when attempting to reproduce realistic AI I/O, and explain why traditional metrics such as throughput and latency from 4-corners synthetic tests (sequential and random reads and writes) fail to reflect true system performance.
The session will also introduce the industry groups working on this problem and their approaches:
• MLPerf Storage v3.0: efforts to incorporate new AI models, synthetic dataset generators, and expanded training/inference pipelines.
• SNIA’s AI Data Workloads TWG: focused on defining open, standardized approaches for characterizing AI storage behavior across retrieval, training, inference, vector DBs, KV cache management, and GPU-initiated I/O.
As AI workloads continue to evolve at unprecedented speed, so too must the benchmarks that evaluate storage solutions. This talk provides a roadmap for how the industry is rising to that challenge, and how attendees can participate in shaping the next era of AI-aligned storage benchmarking.
