
Chiplets, UCIe, Persistent Memory, and Heterogeneous Integration: The Processor Chip of the Future!

Chiplets have become a near-overnight success amid today’s rapid-fire conversion of data centers to AI. But today’s integration of HBM DRAM with multiple SoC chiplets is only the beginning of a larger trend in which otherwise-incompatible technologies will use heterogeneous integration to connect new memory technologies with advanced logic chips, providing both significant energy savings and vastly improved performance at a reduced price point.

Storage Devices for the AI Data Center

The transformational launch of GPT-4 has accelerated the race to build AI data centers for large-scale training and inference. While GPUs and high-bandwidth memory are well-known critical components, the essential role of storage devices in AI infrastructure is often overlooked. This presentation will explore the AI processing pipeline within data centers, emphasizing the crucial role of storage devices such as SSDs in compute and storage nodes. We will examine the characteristics of AI workloads to derive specific requirements for flash storage devices and controllers.

CXL Memory in Windows

In this presentation, we will describe the architecture of CXL memory in Windows and the support that will be available, including the possible usages of CXL memory, the RAS workflows, and the developer interfaces for using CXL memory.

Storage Blending: The Evolving Role of HDD and SSD in Data Systems for an AI and Analytics Era

As the rapid expansion of AI and analytics continues, storage system architecture and total cost of ownership (TCO) are undergoing significant transformation. Emerging technologies such as HAMR in rotating storage and high-capacity, data center-grade QLC in flash promise to redefine the landscape for both hyperscale and OEM data storage solutions. But what will that evolution look like?

Accelerating Object Storage for AI/ML with S3 RDMA

Amazon S3 is the de facto standard for object storage—simple, scalable, and accessible via HTTP. However, traditional S3 access via TCP/IP is CPU-intensive and not designed for the low-latency, high-throughput needs of modern GPU workloads. S3 RDMA aims to bridge that gap. S3 RDMA implements S3 object PUT/GET data transfers over RDMA, essentially bypassing the HTTP stack entirely.
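The control/data split behind that idea can be sketched in a few lines. The sketch below is a simulation with invented names, not the real S3 RDMA protocol: request metadata travels on a control channel, while the object payload is copied between pre-registered memory regions, standing in for a one-sided RDMA READ that never touches the HTTP/TCP stack.

```python
# Hypothetical sketch (names invented) of S3 PUT/GET with an RDMA-style
# data path. RdmaRegion stands in for a registered memory region a NIC
# could DMA into; rdma_read() stands in for a one-sided RDMA READ.

class RdmaRegion:
    """Stand-in for a registered memory region."""
    def __init__(self, size: int):
        self.buf = bytearray(size)

class S3RdmaServer:
    def __init__(self):
        self.objects = {}  # (bucket, key) -> RdmaRegion

    def put(self, bucket: str, key: str, region: RdmaRegion) -> None:
        # Control message carries only metadata; the payload already
        # sits in a registered region.
        self.objects[(bucket, key)] = region

    def get_descriptor(self, bucket: str, key: str) -> RdmaRegion:
        # A GET response is just a descriptor; no payload crosses the
        # HTTP path.
        return self.objects[(bucket, key)]

def rdma_read(dst: RdmaRegion, src: RdmaRegion) -> None:
    # Simulated one-sided RDMA READ: a direct copy between registered
    # regions, with no server CPU on the data path.
    dst.buf[:len(src.buf)] = src.buf

server = S3RdmaServer()
src = RdmaRegion(8)
src.buf[:] = b"GPUDATA!"
server.put("train", "batch-0", src)

dst = RdmaRegion(8)
rdma_read(dst, server.get_descriptor("train", "batch-0"))
print(bytes(dst.buf))  # b'GPUDATA!'
```

The point of the split is that the latency-critical payload movement becomes a memory-to-memory transfer, while S3's bucket/key semantics are preserved unchanged on the control plane.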

Beyond Throughput: Benchmarking Storage for the Complex I/O Patterns of AI with MLPerf Storage and DLIO

Training state-of-the-art AI models, including LLMs, creates unprecedented demands on storage systems that go far beyond simple throughput. The I/O patterns in these workloads—characterized by heavy metadata operations, multi-threaded asynchronous I/O, random access, and complex data formats—present a significant bottleneck that traditional benchmarks fail to capture.
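The workload shape described above can be reproduced in miniature: many small samples read in random order by concurrent workers, with a metadata operation per access. The sketch below is illustrative only — it is not MLPerf Storage or DLIO, just a toy generator of the same access pattern.

```python
# Toy reproduction of an AI training I/O pattern: random-order reads of
# many small files by a pool of threads, one stat() per sample to mimic
# metadata-heavy behavior. Sizes and counts are illustrative.

import os
import random
import tempfile
from concurrent.futures import ThreadPoolExecutor

def make_dataset(root: str, n_files: int = 64, size: int = 4096) -> list:
    """Write n_files small samples to disk and return their paths."""
    paths = []
    for i in range(n_files):
        p = os.path.join(root, f"sample_{i:05d}.bin")
        with open(p, "wb") as f:
            f.write(os.urandom(size))
        paths.append(p)
    return paths

def read_sample(path: str) -> int:
    os.stat(path)              # metadata op per access
    with open(path, "rb") as f:
        return len(f.read())   # whole-sample read

with tempfile.TemporaryDirectory() as root:
    paths = make_dataset(root)
    epoch = random.sample(paths, len(paths))      # random access order
    with ThreadPoolExecutor(max_workers=8) as pool:  # concurrent readers
        total = sum(pool.map(read_sample, epoch))

print(total)  # 64 files * 4096 bytes = 262144
```

Even this toy version shows why throughput alone is a poor metric: per-sample latency here is dominated by open/stat overhead on small files, not by sequential bandwidth.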

Global Distributed Client-side Caching for HPC/AI Storage Systems

HPC and AI workloads require processing massive datasets and executing complex computations at exascale speeds to deliver time-critical insights. In distributed environments where storage systems coordinate and share results, communication overhead can become a critical bottleneck. This challenge underscores the need for storage solutions that deliver scalable, parallel access with microsecond latencies from compute clusters. Caching can help reduce communication costs when implemented on either servers or clients.
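A minimal client-side cache illustrates the cost reduction: repeated reads are served locally and never reach the remote store. The sketch below is a hedged toy (LRU policy, invented names, no coherence protocol), not the presented system.

```python
# Hedged sketch of a client-side read cache in front of a remote store:
# hits are served locally, avoiding the communication round trip. The
# LRU policy and capacity are illustrative choices.

from collections import OrderedDict

class ClientCache:
    def __init__(self, fetch, capacity: int = 128):
        self._fetch = fetch          # callable hitting remote storage
        self._cache = OrderedDict()  # LRU order: oldest entry first
        self._capacity = capacity
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._cache:
            self._cache.move_to_end(key)  # refresh LRU position
            self.hits += 1
            return self._cache[key]
        self.misses += 1
        value = self._fetch(key)          # remote round trip
        self._cache[key] = value
        if len(self._cache) > self._capacity:
            self._cache.popitem(last=False)  # evict LRU entry
        return value

remote_reads = 0
def remote_fetch(key):
    global remote_reads
    remote_reads += 1                 # count round trips to the server
    return f"block-{key}"

cache = ClientCache(remote_fetch, capacity=4)
for key in [1, 2, 3, 1, 2, 3, 1]:
    cache.read(key)
print(remote_reads, cache.hits)  # 3 remote reads, 4 cache hits
```

In a real distributed deployment the hard part is what this toy omits: keeping such caches coherent across many clients without reintroducing the communication overhead they were meant to remove.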

Towards Unified Knowledge Platforms: Evolving Storage Systems for Generative and Agentic AI

The rise of Generative and Agentic AI has driven a fundamental shift in storage, from simply storing data to functioning as a comprehensive knowledge management system. The traditional model of storing data and system metadata and providing analytical capabilities on top of it is now inadequate. Agentic AI workflows require access to semantically enriched representations of data, including embeddings and derived metadata (e.g., classification, categorization). As data is ingested, storage systems must support real-time or near-real-time generation and association of such metadata.
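The ingest-time enrichment described above can be sketched as follows. This is a hedged toy: the "embedding" is a deterministic hash-derived vector and the classifier is a keyword match, both placeholders for real models, and all names are invented.

```python
# Toy ingest pipeline that attaches derived metadata (embedding plus
# classification) at write time rather than as a later batch job.
# toy_embedding() and classify() are placeholders, not real models.

import hashlib

def toy_embedding(text: str, dims: int = 4) -> list:
    # Placeholder: deterministic pseudo-embedding from a hash digest.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255.0 for b in digest[:dims]]

def classify(text: str) -> str:
    # Placeholder keyword classifier standing in for a real model.
    return "finance" if "invoice" in text.lower() else "general"

class KnowledgeStore:
    def __init__(self):
        self.objects = {}

    def ingest(self, key: str, text: str) -> None:
        # Enrichment happens inline with ingest, so the semantic
        # representation is available as soon as the data lands.
        self.objects[key] = {
            "data": text,
            "embedding": toy_embedding(text),
            "class": classify(text),
        }

store = KnowledgeStore()
store.ingest("doc-1", "Invoice #42 for GPU cluster rental")
print(store.objects["doc-1"]["class"])           # finance
print(len(store.objects["doc-1"]["embedding"]))  # 4
```

The design point is that the derived metadata lives with the object from the moment of ingest, so agentic workflows can query by meaning without a separate enrichment pass.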
