SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA
Most programs have traditionally been divisible into either "in-memory" or "database" applications. When writing in-memory applications, all that matters for speed is the efficiency of the algorithms and data structures in the CPU/memory system during the run of the program, whereas with database applications the main determinant of speed is the number and pattern of accesses to storage (e.g., random accesses vs. sequential ones). The reason that database application programmers don't need to be too concerned with in-memory algorithmic efficiency is that the time it takes for one storage access, typically measured in hundreds of microseconds, corresponds to tens or hundreds of thousands of CPU instruction times, which are measured in picoseconds. Thus, tuning the CPU part of the problem, other than by selecting a reasonable algorithm, is generally a waste of time in such applications.

However, when programming with persistent memory, the luxury of concentrating on only one of these sets of constraints is gone. One access to persistent memory takes a few hundred nanoseconds, which is roughly the time it takes to execute 1000 sequential CPU instructions. Thus, with such storage devices, the efficiency of CPU processing is critical to overall program performance in a way that is not true with slower storage devices. This new significance of CPU processing time, along with the fact that persistent memory is accessed through the virtual memory mechanism, implies new restrictions on the design and implementation of such programs if optimal performance is to be obtained. A few of these restrictions are:

1. Random accesses to arrays or vectors that won't fit in the CPU caches must be avoided for optimal performance.
2. The time to calculate a hash code or similar operation, normally considered insignificant with a database program, can cause performance bottlenecks and must be executed in as efficient a manner as possible.
3. System calls that cause a context switch cannot be executed in the critical path because they take too much time (on the order of a microsecond).

The design of a heterogeneous hash table tuned for persistent memory provides an instructive example of such restrictions.
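The following is a minimal sketch, not the heterogeneous hash table described in the talk, illustrating how the three restrictions above shape code: adjacent-bucket (linear) probing keeps lookups cache-friendly, hashing is a single multiply and shift, and the hot path makes no system calls. The bucket layout, table size, and the anonymous mapping standing in for a persistent heap are assumptions made for the example.

```cpp
// Sketch only: open-addressing table laid out in a memory-mapped region.
#include <cstdint>
#include <cstring>
#include <cstdio>
#include <sys/mman.h>

struct Bucket {                 // 16 bytes: four buckets fit in one 64-byte cache line
    uint64_t key;               // 0 means "empty" in this sketch
    uint64_t value;
};

constexpr size_t kBuckets = 1u << 20;       // power of two so we can mask, not divide

// Restriction 2: keep hashing cheap -- one multiply and a shift (Fibonacci hashing).
static inline size_t hash_key(uint64_t key) {
    return (key * 0x9E3779B97F4A7C15ull) >> 44;   // top 20 bits -> [0, kBuckets)
}

// Restriction 1: linear probing visits adjacent buckets, so a collision usually stays
// in the cache line already fetched instead of hopping randomly through the table.
// Restriction 3: no system calls anywhere in this path.
bool pm_insert(Bucket* table, uint64_t key, uint64_t value) {
    size_t i = hash_key(key);
    for (size_t probes = 0; probes < kBuckets; ++probes, i = (i + 1) & (kBuckets - 1)) {
        if (table[i].key == 0 || table[i].key == key) {
            table[i].key = key;
            table[i].value = value;
            // Real persistent-memory code would flush and fence here (e.g. CLWB +
            // SFENCE or a pmem library call); omitted to keep the sketch on layout.
            return true;
        }
    }
    return false;   // table full
}

bool pm_lookup(const Bucket* table, uint64_t key, uint64_t* value_out) {
    size_t i = hash_key(key);
    for (size_t probes = 0; probes < kBuckets; ++probes, i = (i + 1) & (kBuckets - 1)) {
        if (table[i].key == key) { *value_out = table[i].value; return true; }
        if (table[i].key == 0)   return false;
    }
    return false;
}

int main() {
    // Stand-in for a persistent heap: an anonymous mapping.  With real persistent
    // memory this would be a DAX-mapped file accessed through the page tables.
    void* mem = mmap(nullptr, kBuckets * sizeof(Bucket), PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return 1;
    Bucket* table = static_cast<Bucket*>(mem);
    std::memset(table, 0, kBuckets * sizeof(Bucket));

    pm_insert(table, 42, 4242);
    uint64_t v = 0;
    if (pm_lookup(table, 42, &v)) std::printf("42 -> %llu\n", (unsigned long long)v);
    return 0;
}
```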
Digital transformation is driving applications to process more data in memory and in real time. A lack of infrastructure technologies to support such environments is pushing customers to either over-provision expensive hardware or deploy complex solutions. Enterprises and cloud providers are looking to provide more flexibility and improve efficiencies for their customers, all without requiring a huge lift in costs, hardware infrastructure, or applications. In this talk, we showcase VMware Software Memory Tiering, including near-term trends like CXL and PCIe Gen5 for memory disaggregation. Finally, we will walk you through tech previews showing the value of VMware Software Memory Tiering with some real-world, mission-critical Oracle workload use cases.
Persistent scripting brings the benefits of persistent memory programming to high-level interpreted languages. More importantly, it brings the convenience and programmer productivity of scripting to persistent memory programming. We have integrated a novel generic persistent memory allocator into a popular scripting language interpreter, which now exposes a simple and intuitive persistence interface: A flag notifies the interpreter that a script’s variables reside in a persistent heap in a specified file. The interpreter begins script execution with all variables in the persistent heap ready for immediate use. New variables defined by the running script are allocated on the persistent heap and are thus available to subsequent executions. Scripts themselves are unmodified and persistent heaps may be shared freely between unrelated scripts.
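The interpreter and allocator described in this abstract are not reproduced here; the sketch below only illustrates the underlying idea of a file-backed heap whose contents are immediately usable on the next run. The heap file name, fixed heap size, and root-record layout are assumptions made for the example.

```cpp
// Sketch only: a file-backed "persistent heap" with one toy variable that
// survives across process runs, in the spirit of the persistence interface above.
#include <cstdint>
#include <cstdio>
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

struct HeapRoot {               // lives at offset 0 of the persistent heap file
    uint64_t magic;             // tag marking an initialized heap
    uint64_t run_count;         // toy "persistent variable" updated by every run
};

constexpr uint64_t kMagic   = 0x50455253484541ull;  // arbitrary tag
constexpr size_t   kHeapLen = 1 << 20;              // 1 MiB heap for the sketch

int main(int argc, char** argv) {
    const char* path = (argc > 1) ? argv[1] : "script.heap";   // hypothetical heap file
    int fd = open(path, O_RDWR | O_CREAT, 0600);
    if (fd < 0 || ftruncate(fd, kHeapLen) != 0) return 1;

    // MAP_SHARED makes stores reach the file, so they outlive this process.
    void* mem = mmap(nullptr, kHeapLen, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (mem == MAP_FAILED) return 1;
    HeapRoot* root = static_cast<HeapRoot*>(mem);

    if (root->magic != kMagic) {        // first run: the freshly created file is zeroed
        root->magic = kMagic;
        root->run_count = 0;
    }
    root->run_count += 1;               // picks up where the previous run left off
    std::printf("this heap has been used %llu time(s)\n",
                (unsigned long long)root->run_count);

    msync(mem, kHeapLen, MS_SYNC);      // push updates to the backing file
    munmap(mem, kHeapLen);
    close(fd);
    return 0;
}
```

Running the program repeatedly against the same heap file prints an increasing count, which is the property the abstract describes: state created by one execution is ready for immediate use by the next, with no explicit load or store step in the "script" itself.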
Emerging memory technologies have gotten a couple of big boosts over the past few years, one in the form of Intel’s Optane products, and the other from the migration of CMOS logic to nodes that NOR flash, and now SRAM, cannot practically support. Although these appear to be two very different spheres, a lot of the work that has been undertaken to support Intel’s Optane products (also known as 3D XPoint) will lead to improved use of persistent memories on processors of all kinds: “xPUs”. In this presentation we will review emerging memory technologies and their roles in replacing other on-chip memories; the developments fostered by Optane through SNIA and other organizations, which remain usable in other aspects of computing; the emergence of new Near/Far Memory paradigms that have spawned interface protocols like CXL and OMI; and the emergence of “Chiplets” and their potential role in the evolution of persistent processor caches.
Data persistence on CXL is an essential enabler toward the goal of instant-on processing. DRAM class performance combined with non-volatility on CXL enables a new class of computing architectures that can exploit these features and solve real-world bottlenecks for system performance, data reliability, and recovery from power failures. New authentication methods also enhance the security of server data in a world of cyberattacks.