SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA
NOTE: this paper was developed by Ziye Yang, a Staff Software Engineer at Intel, and is being presented by his colleague Yadong Li, a Principal Engineer in the Ethernet Products Group at Intel. In many use cases, FaaS applications are deployed in containers or virtual machines for isolation. One of the challenges is therefore how to quickly construct the execution environment for FaaS, which can be divided into two parts: the runtime execution environment (VM, container, or process) and the image/file system (including code and libraries) that must be constructed or provisioned before the function can run. To accelerate FaaS cold start, there is clearly a great deal of optimization work possible in both parts. In this talk, we propose a novel approach that constructs the FaaS image environment on an IPU (infrastructure processing unit) instead of optimizing it on the host. This approach offers the following benefits: (1) performance and resource benefits, i.e., the image construction overhead is removed from the host side; (2) security benefits, i.e., when the images are constructed by the IPU, the IPU can present them to the host through VF/PF devices. The host can directly hotplug (hot attach/detach) these devices into VMs or containers and mount them at a specific mount point, and after the FaaS application finishes executing, the IPU can immediately hot remove the devices from the host, so leakage of sensitive information is avoided.
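The following is a minimal, illustrative host-side sketch of the flow the abstract describes, not code from the paper: after the IPU hot-attaches a VF/PF-backed block device carrying a pre-built FaaS image, a host agent mounts it read-only for the function's container and unmounts it when the function finishes. The device path, mount point, and filesystem type below are assumptions for illustration only.

```go
// Hypothetical host-side agent step: mount an IPU-provisioned image device
// for one FaaS invocation, then release it so no image data lingers on the host.
package main

import (
	"fmt"
	"os"
	"time"

	"golang.org/x/sys/unix"
)

// waitForDevice polls until the hot-attached device node appears on the host.
func waitForDevice(dev string, timeout time.Duration) error {
	deadline := time.Now().Add(timeout)
	for time.Now().Before(deadline) {
		if _, err := os.Stat(dev); err == nil {
			return nil
		}
		time.Sleep(50 * time.Millisecond)
	}
	return fmt.Errorf("device %s did not appear", dev)
}

func main() {
	dev := "/dev/vdb"                 // hypothetical block device exposed by the IPU
	mnt := "/run/faas/fn-1234/rootfs" // hypothetical per-function mount point

	if err := waitForDevice(dev, 5*time.Second); err != nil {
		panic(err)
	}
	if err := os.MkdirAll(mnt, 0o755); err != nil {
		panic(err)
	}
	// Mount the image read-only; the container runtime can then bind this
	// path into the function's root filesystem.
	if err := unix.Mount(dev, mnt, "ext4", unix.MS_RDONLY, ""); err != nil {
		panic(err)
	}
	defer unix.Unmount(mnt, 0) // after the function exits, the IPU hot-detaches the device

	fmt.Println("FaaS image mounted at", mnt)
	// ... launch the function's container with mnt as its image root ...
}
```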
DPUs (data processing units) are an exciting new category of processor that complement CPUs and GPUs inside data centers. DPUs are fast at data-centric tasks while retaining the full programmability of CPUs. DPUs have typically been used to offload networking and security functions from compute servers, but until recently they have not been used to build storage systems. In this talk, we will briefly introduce DPUs and then highlight their use in building storage systems, an emerging new use case. We argue that full-featured, high-performance future storage systems will be built using DPUs. By examining the special requirements of storage processing, we will show why DPUs are architecturally better suited than CPUs for this task. We will describe the architecture of a DPU-based storage system in some detail and then make a compelling case for why DPU-based storage systems will be superior to storage systems built on general-purpose CPUs (i.e., software-defined storage, or SDS). Using concrete examples of real implementations of DPU- and CPU-based storage systems, we will show that DPU-based systems can be anywhere from 3x to 10x better than SDS on important metrics such as cost, performance, and power. To build competitive storage systems, vendors will have no choice but to use DPUs. We conclude that future storage systems will be built with DPUs rather than general-purpose CPUs, arguably signaling the end of the software-defined storage era.
As hardware layers in storage systems (such as network and storage devices) continue to increase in performance, it is vital that the IO software stack does not fall behind and become the bottleneck. Leveraging the capabilities of computational storage devices, such as data processing units (DPUs), allows the IO software stack to accelerate CPU- and memory-bandwidth-constrained operations and fully take advantage of the storage hardware in the system. DPUs offer the ability to make storage systems more efficient, but it has proven difficult to integrate them into IO software stacks to produce production-quality accelerated storage systems. Los Alamos National Laboratory (LANL) has worked on streamlining the process of integrating DPUs to improve file system performance and reduce the storage footprint on storage devices. The Data Processing Unit Services Module (DPU-SVC) was created to provide a standardized interface for DPUs to plug into, as well as a standardized interface for DPU consumers to use. The initial DPU consumer targeted was the open-source Zettabyte File System (ZFS). ZFS is a commonly used backend for LANL's parallel file systems due to its rich feature set for data transformations (compression, deduplication) and data integrity functions (checksums, erasure coding). The ZFS Interface for Accelerators (Z.I.A.), which uses the consumer-facing standard interface of the DPU-SVC, was added to ZFS to allow data to be moved out of ZFS and into DPUs. These changes allow transparent acceleration of ZFS operations: users do not have to modify their applications to enjoy the benefits provided by DPUs in the storage system. Using DPUs in coordination with the ZFS code base can increase overall write throughput to LANL storage systems by a factor of 10 to 30 over current storage system performance. This increase is achieved by moving ZFS operations originally implemented in software, such as compression, checksumming, and erasure coding, to hardware-accelerated implementations, which in turn frees up CPU and memory bandwidth for user applications. This talk will present technical details of how the DPU-SVC and Z.I.A. accelerate the ZFS IO stack through attached DPUs. Results will also be presented showing write performance improvements from using these layers with LANL scientific data sets and storage systems. Some background knowledge of computational storage, ZFS, and kernel modules would be beneficial to the audience.
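As a rough illustration of the provider/consumer split such an offload layer establishes, the sketch below shows accelerators registering as providers of data-transformation primitives while a consumer (the role Z.I.A. plays for ZFS) calls one stable interface and falls back to software when no device is present. DPU-SVC itself is a kernel-side module; the package, interface, and function names here are hypothetical and only show the pattern.

```go
// Illustrative offload provider/consumer pattern (not the actual DPU-SVC API).
package offload

import "hash/crc32"

// Provider is the device-facing side: a DPU driver implements these primitives.
type Provider interface {
	Compress(src []byte) ([]byte, error)
	Checksum(src []byte) (uint32, error)
	ErasureEncode(data [][]byte, parity int) ([][]byte, error)
}

// softwareProvider is the CPU fallback used when no DPU is registered.
type softwareProvider struct{}

func (softwareProvider) Compress(src []byte) ([]byte, error) { return src, nil } // placeholder: no-op
func (softwareProvider) Checksum(src []byte) (uint32, error) {
	return crc32.ChecksumIEEE(src), nil
}
func (softwareProvider) ErasureEncode(data [][]byte, parity int) ([][]byte, error) {
	return make([][]byte, parity), nil // placeholder
}

var active Provider = softwareProvider{}

// Register is called by a DPU driver when its hardware becomes available.
func Register(p Provider) { active = p }

// Checksum is the consumer-facing call a filesystem would use; the data path
// is the same whether the work lands on a DPU or on the CPU fallback.
func Checksum(block []byte) (uint32, error) { return active.Checksum(block) }
```

The point of the pattern is that the consumer never branches on which accelerator is present, which is what makes the acceleration transparent to applications.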
In this presentation, we will describe a complete end-to-end Software Defined Storage (SDS) solution for cloud data centers using Infrastructure Processing Units (IPUs). IPUs provide a high-performance NVMe interface to the host, abstracting away the details of networked storage and enabling storage disaggregation and bare-metal hosting. NVMe/TCP is a high-performance protocol that is widely deployed because of its ease of deployment and its scalability in large scale-out networks. Integrating an IPU-based NVMe/TCP initiator with a Kubernetes CSI plugin for a clustered NVMe/TCP target provides a full software-defined storage solution for IPU-equipped hosts. The outline of the presentation:
- Overview of IPU architecture and SPDK-based NVMe/TCP initiator design
- Overview of the LightBits Cloud Data Platform, a full-featured clustered NVMe/TCP target
- Integration of the IPU-based NVMe/TCP initiator and the backend storage service
-- IPU Storage Management Agent (SMA)
-- Integration with the K8s CSI node driver for orchestration
- Summary and call to action to enhance orchestration frameworks for IPU-based storage solutions
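To make the integration concrete, here is a minimal sketch, under stated assumptions, of how a CSI node plugin's staging step could drive the IPU-resident Storage Management Agent so that a clustered NVMe/TCP volume surfaces on the host as a local NVMe device and is mounted at the kubelet staging path. The smaClient interface and its method names are hypothetical placeholders, not the shipped SMA gRPC API.

```go
// Hypothetical CSI node-driver staging flow against an IPU-hosted SMA.
package driver

import (
	"context"
	"fmt"

	"golang.org/x/sys/unix"
)

// smaClient abstracts the calls a node driver would make to the IPU's agent.
type smaClient interface {
	// AttachVolume asks the IPU's NVMe/TCP initiator to connect to the target
	// and expose the namespace through an emulated NVMe function on the host.
	AttachVolume(ctx context.Context, volumeID, targetAddr string) (devicePath string, err error)
	DetachVolume(ctx context.Context, volumeID string) error
}

type nodeService struct {
	sma smaClient
}

// stageVolume mirrors the CSI NodeStageVolume responsibility: make the volume
// available at a staging path so later publish calls can bind-mount it into pods.
func (n *nodeService) stageVolume(ctx context.Context, volumeID, targetAddr, stagingPath string) error {
	dev, err := n.sma.AttachVolume(ctx, volumeID, targetAddr)
	if err != nil {
		return fmt.Errorf("sma attach: %w", err)
	}
	// The host only sees a plain NVMe block device; networked-storage details
	// (NVMe/TCP session, multipath, encryption) stay on the IPU.
	if err := unix.Mount(dev, stagingPath, "ext4", 0, ""); err != nil {
		return fmt.Errorf("mount %s: %w", dev, err)
	}
	return nil
}
```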
Local disk emulation using domain-specific hardware presents a great opportunity for innovation in the storage domain. Standard host-side drivers like NVMe or virtio-blk and legacy applications can be enabled to access disaggregated storage at scale using state-of-the-art protocols like NVMe over TCP (NVMe/TCP), while increasing performance through the offload of storage services to the hardware (SmartNIC/DPU/IPU/xPU). In this talk, we present how IPDK, an open-source, vendor-agnostic framework of drivers and APIs for infrastructure offload and management, can be used to dynamically create multiple virtual storage devices that a tenant uses to access a remote storage target through standard para-virtualized host-side virtio-blk or NVMe drivers. Attendees will learn how they can use IPDK to exercise this scenario in a fully containerized environment using a KVM-based IPU simulation suitable for rapid prototyping, and then run their use cases on accelerated platforms. To demonstrate the hardware-agnosticism of the solution, we will show how the exact same host-side software stack used for prototyping can be used with a real hardware-accelerated FPGA-based SmartNIC or an ASIC-based IPU.
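A hedged sketch of the control flow this scenario implies is shown below, using hypothetical API names rather than the real IPDK storage RPCs: the infrastructure side first connects the xPU's initiator to a remote NVMe/TCP subsystem, then exposes one of its namespaces to the tenant host as a para-virtualized virtio-blk device. The same two calls would apply whether the RPCs land on the KVM-based IPU simulation or on a real FPGA/ASIC platform, which is what keeps the host-side driver stack unchanged.

```go
// Hypothetical vendor-agnostic xPU storage control-plane calls (illustrative only).
package ipdkdemo

import (
	"context"
	"fmt"
	"log"
)

// xpuStorageAPI stands in for an IPDK-style storage control plane.
type xpuStorageAPI interface {
	ConnectNVMeTCP(ctx context.Context, traddr, trsvcid, subnqn string) (controllerID string, err error)
	CreateVirtioBlk(ctx context.Context, controllerID string, namespaceID int, pciFunction string) (deviceID string, err error)
}

func provision(ctx context.Context, api xpuStorageAPI) error {
	// 1. The xPU-side NVMe/TCP initiator attaches to the disaggregated target
	//    (example address and NQN).
	ctrl, err := api.ConnectNVMeTCP(ctx, "192.0.2.10", "4420", "nqn.2016-06.io.spdk:tenant1")
	if err != nil {
		return fmt.Errorf("connect target: %w", err)
	}
	// 2. Expose namespace 1 to the host as virtio-blk on a chosen PCIe function;
	//    the tenant just sees a standard /dev/vdX device.
	dev, err := api.CreateVirtioBlk(ctx, ctrl, 1, "0000:af:00.1")
	if err != nil {
		return fmt.Errorf("create virtio-blk: %w", err)
	}
	log.Printf("virtio-blk device %s exposed to host", dev)
	return nil
}
```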
A new class of cloud and data center infrastructure is emerging in the marketplace. This new infrastructure element, often referred to as a Data Processing Unit (DPU) or Infrastructure Processing Unit (IPU), takes the form of a server-hosted PCIe add-in card or on-board chip(s) containing one or more ASICs or FPGAs, usually anchored around a single powerful SoC device. DPU/IPU-like devices have their roots in the evolution of SmartNIC devices but separate themselves from that legacy in several important ways. The OPI project has been created to address the questions this new class of device raises and to foster the emergence of an open and creative software ecosystem for DPU/IPU-based cloud infrastructure. The project intends to delineate what a DPU/IPU is; to loosely define a framework and architecture for DPU/IPU-based software stacks applicable to any vendor's hardware solution; to allow the creation of a rich open-source application ecosystem; to integrate with existing open-source projects aligned to the same vision, such as the Linux kernel, IPDK.io, DPDK, and SPDK; and to create new APIs for interaction with and between the elements of the DPU/IPU ecosystem:
* the DPU/IPU hardware
* DPU/IPU-hosted applications
* the host node
* remote provisioning software
* remote orchestration software
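Purely as an illustration of the kind of vendor-neutral surface such APIs aim at, the sketch below shows one interface that provisioning and orchestration software could call, with each DPU/IPU vendor supplying the implementation behind it. These are not the OPI APIs; every name here is hypothetical.

```go
// Hypothetical vendor-neutral infrastructure interface (not the OPI API).
package opisketch

import "context"

// DeviceInfo describes a DPU/IPU to remote provisioning software.
type DeviceInfo struct {
	Vendor, Model, FirmwareVersion string
}

// InfraProvider is what a vendor would plug in; orchestration code depends only on this.
type InfraProvider interface {
	// Inventory reports what hardware is present for remote provisioning.
	Inventory(ctx context.Context) (DeviceInfo, error)
	// CreateStorageFunction exposes an emulated NVMe or virtio-blk function to the host node.
	CreateStorageFunction(ctx context.Context, spec map[string]string) (id string, err error)
	// DeployApp runs a containerized service (e.g., a firewall or storage target) on the DPU's SoC.
	DeployApp(ctx context.Context, image string) (id string, err error)
}
```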