SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA

Disrupting the GPU Hegemony: Can Smart Memory and Storage Redefine AI Infrastructure?

Abstract

AI infrastructure is dominated by GPUs—but should it be? As foundational model inference scales, performance bottlenecks are shifting away from compute and toward memory and I/O. HBM sits underutilized, KV-cache footprints explode (a rough sizing sketch follows this abstract), and model transfer times dominate pipeline latency. Meanwhile, compression, CXL fabrics, computational memory, and SmartNIC-enabled storage are emerging as powerful levers to close the tokens-per-second-per-watt gap.

This panel assembles voices from across the AI hardware and software stack to ask the hard question: can memory and storage innovation disrupt the GPU-centric status quo, or is AI destined to remain homogeneous? You'll hear from panelists spanning the stack: a computational HBM vendor (Numem), an AI accelerator startup (Recogni), a compression IP company (MaxLinear), a foundational model provider (Zyphra), and a cloud-scale storage architect (Solidigm). Together, they'll explore:

- Why decode-heavy inference is choking accelerators, even with massive FLOPs
- Whether inline decompression and memory tiering can fix HBM underutilization
- How model developers should (or shouldn't) design for memory-aware inference
- Whether chiplet- and UCIe-based systems can reset the balance of power in AI

Expect live debate, real benchmark data, and cross-layer perspectives on a topic that will define AI system economics in the coming decade. If you care about performance per watt, memory bottlenecks, or building sustainable AI infrastructure, don't miss this conversation.
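To give a sense of scale behind the "KV-cache footprints explode" claim, here is a minimal back-of-envelope sketch. Every figure in it (layer count, head count, context length, batch size, precision) is an illustrative assumption, not data from the panel or any named vendor.

# Back-of-envelope KV cache sizing for a decoder-only transformer.
# All model dimensions below are illustrative assumptions.

def kv_cache_bytes(num_layers, num_kv_heads, head_dim, seq_len, batch, bytes_per_elem=2):
    # Keys and values are both cached:
    # 2 * layers * batch * seq_len * kv_heads * head_dim elements.
    return 2 * num_layers * batch * seq_len * num_kv_heads * head_dim * bytes_per_elem

# Hypothetical 70B-class model: 80 layers, 8 KV heads, head_dim 128, fp16,
# serving a batch of 32 requests at an 8K-token context.
size = kv_cache_bytes(num_layers=80, num_kv_heads=8, head_dim=128, seq_len=8192, batch=32)
print(f"KV cache: {size / 2**30:.0f} GiB")  # ~80 GiB, on par with a single accelerator's entire HBM

Under these assumptions the cache alone competes with the model weights for HBM capacity, which is why compression, tiering, and CXL pooling keep coming up in this discussion.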

Learning Objectives

- Understand compression's role across memory, storage, and compute tiers in inference.
- Explore the real bottlenecks behind HBM underutilization and decoder latency in LLMs.
- Discover how inline decompression, CXL pooling, and new memory formats interact.
- Get an end-to-end view, from model compression to the DRAM subsystem, of optimizing tokens/sec/$ (a rough bandwidth-bound estimate follows below).
- Debate where responsibility lies for solving memory inefficiency in AI inference.
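As a companion to the tokens/sec/$ objective, a rough roofline-style estimate shows why decode throughput (and hence tokens per second per watt or per dollar) tends to track memory bandwidth rather than FLOPs. All bandwidth, power, and model-size numbers here are assumptions chosen for illustration, not benchmark results.

# Rough ceiling for autoregressive decode: each new token streams the full
# weight set plus the sequence's KV cache from HBM, so throughput is bounded
# by bandwidth / bytes-moved-per-token rather than by FLOPs.
# All numbers are illustrative assumptions, not benchmark data.

hbm_bandwidth_bytes_s = 3.35e12   # hypothetical accelerator HBM bandwidth (~3.35 TB/s)
weight_bytes = 70e9 * 2           # hypothetical 70B-parameter model in fp16
kv_bytes_per_step = 2.5 * 2**30   # per-sequence KV cache read at 8K context (see sketch above)
board_power_w = 700               # hypothetical board power in watts

tokens_per_s = hbm_bandwidth_bytes_s / (weight_bytes + kv_bytes_per_step)
print(f"~{tokens_per_s:.0f} tokens/s per unbatched sequence, "
      f"~{tokens_per_s / board_power_w:.3f} tokens/s/W")

Batching amortizes the weight reads, but the KV-cache term grows with batch size and context length, which is exactly the tension between abundant FLOPs and scarce memory bandwidth that the panel examines.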