SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA

Simulating CXL.mem for Fun and Profit

Abstract

CXL.mem enables hosts to expand their memory beyond individual servers and to access remote memory regions using load and store instructions. In addition, CXL.mem enables memory sharing among its endpoints. Realizing memory sharing requires extending the coherency management protocol beyond individual hosts: hosts and devices need to track the state of each memory region using individual finite state machines. This enables devices to modify the state of memory at specific hosts when needed, which is referred to as back-invalidation snooping.

Our goal is to democratize the exploration of large-scale CXL.mem deployments using simulations. In this talk, we describe the design and implementation of our packet-level CXL.mem simulator, which models hosts, CXL switches, and CXL endpoints. We implemented our simulator on top of SimPy, a discrete-event simulation framework based on Python. Users can use the simulator APIs to (i) construct CXL topologies by connecting hosts with their CXL endpoints, switches, and devices, (ii) define the control and data flow of applications, and (iii) submit applications to specific hosts for execution. The simulator can also be used to explore new CXL switch architectures.

Next, we show how our simulator helps explore a variety of use cases. First, we characterize the overheads that CXL coherency incurs to realize memory sharing. These overheads include access, serialization, snooping, and eviction delays. Second, we discuss how our simulator helps analyze the performance of existing task schedulers and memory allocators. In addition, we describe how the simulator enables the design of a new CXL-aware task scheduler based on our simulation-driven insights.
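The per-region state tracking and back-invalidation snooping described above can be illustrated with a minimal sketch. It is not the simulator presented in the talk and does not follow the CXL specification's message set: the three-state machine (invalid, shared, modified) and the names `Host`, `Device`, `back_invalidate`, and `snoops_sent` are assumptions chosen for illustration, and timing is ignored entirely.

```python
from enum import Enum


class State(Enum):
    INVALID = "I"
    SHARED = "S"
    MODIFIED = "M"


class Host:
    """Hypothetical host that tracks one state per memory region."""

    def __init__(self, name):
        self.name = name
        self.cache = {}  # region -> State

    def state(self, region):
        return self.cache.get(region, State.INVALID)

    def back_invalidate(self, region):
        # Device-initiated snoop: the host drops its local copy.
        self.cache[region] = State.INVALID


class Device:
    """Hypothetical memory device that tracks which hosts hold each region."""

    def __init__(self):
        self.holders = {}  # region -> list of hosts holding a copy
        self.snoops_sent = 0

    def _invalidate_others(self, requester, region, only_modified):
        kept = []
        for host in self.holders.get(region, []):
            if host is requester:
                continue
            if only_modified and host.state(region) is not State.MODIFIED:
                kept.append(host)  # other readers may keep their copy
                continue
            host.back_invalidate(region)
            self.snoops_sent += 1
        self.holders[region] = kept + [requester]

    def read(self, host, region):
        if host.state(region) is State.MODIFIED:
            return  # the requester already owns the region
        # Simplification: a remote writer is invalidated, not downgraded.
        self._invalidate_others(host, region, only_modified=True)
        host.cache[region] = State.SHARED

    def write(self, host, region):
        # A write requires exclusive ownership, so all other copies go.
        self._invalidate_others(host, region, only_modified=False)
        host.cache[region] = State.MODIFIED


dev = Device()
a, b = Host("A"), Host("B")
region = 0x1000
dev.read(a, region)   # A holds the region as shared
dev.read(b, region)   # B shares it too, no snoop needed
dev.write(b, region)  # A is back-invalidated, B becomes the owner
dev.read(a, region)   # B is back-invalidated, A shares again
print(a.state(region), b.state(region), dev.snoops_sent)
```

In this sketch every back-invalidation is only counted; a packet-level model would additionally charge each one a delay, which is the kind of snooping overhead the abstract sets out to characterize.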

Learning Objectives

Understand the need for simulations to gain insight into CXL performance
Identify the key challenges of CXL memory sharing
Utilize simulation-driven insights to optimize CXL memory sharing