Integrating Computational Storage Devices into Storage Software Stacks
Storage systems rely on data transformations such as compression, checksumming, and erasure coding to save capacity and protect against data loss. These transformations, however, are both memory-bandwidth and CPU intensive, creating a large disparity between the performance of the storage software layers and that of the storage devices backing the data. This disparity only continues to grow as NVMe devices deliver more bandwidth with each new PCIe generation. Computational storage devices (accelerators) provide a path forward by offloading these resource-intensive transformations to hardware designed to accelerate them. However, integrating these devices into storage software stacks has been a challenge: each accelerator exposes its own custom API that must be integrated directly into the storage software, making it difficult to support multiple accelerators and to maintain custom code for each.
The Data Processing Unit Services Module (DPUSM), a kernel module, addresses this challenge by providing a uniform API through which storage software stacks can communicate with any accelerator. The storage software layers program against the DPUSM API, while accelerator vendors implement device-specific code behind it. This separation allows accelerators to integrate seamlessly with storage system software. This talk will highlight how the DPUSM is leveraged by the Zettabyte File System (ZFS) through the ZFS Interface for Accelerators (Z.I.A.), which lets ZFS offload its data transformations to different accelerators for speedups of up to 16x.
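To make the separation concrete, the sketch below shows what a vendor-side provider module might look like in C. It is a minimal illustration, not the actual DPUSM API: every identifier here (dpusm_provider_ops, dpusm_register_provider, dpusm_unregister_provider, the "myaccel" provider) is an assumed, hypothetical name. The real interface is defined in the DPUSM kernel module's headers.

    /*
     * Hypothetical sketch of a DPUSM-style provider module.
     * All identifiers are illustrative assumptions, not the
     * actual DPUSM symbols.
     */
    #include <linux/module.h>
    #include <linux/init.h>
    #include <linux/types.h>
    #include <linux/errno.h>

    /* Operations a vendor implements for its accelerator. */
    struct dpusm_provider_ops {
        /* Offload compression: src buffer in, dst buffer out. */
        int (*compress)(const void *src, size_t src_len,
                        void *dst, size_t *dst_len);
        /* Offload checksum computation over a buffer. */
        int (*checksum)(const void *buf, size_t len, u64 *out);
    };

    /* Vendor-specific implementations backed by the device. */
    static int myaccel_compress(const void *src, size_t src_len,
                                void *dst, size_t *dst_len)
    {
        /* Submit a compression job to the accelerator and wait. */
        return -EOPNOTSUPP; /* stub for illustration */
    }

    static int myaccel_checksum(const void *buf, size_t len, u64 *out)
    {
        return -EOPNOTSUPP; /* stub for illustration */
    }

    static const struct dpusm_provider_ops myaccel_ops = {
        .compress = myaccel_compress,
        .checksum = myaccel_checksum,
    };

    /* Assumed registration entry points exported by the DPUSM. */
    extern int dpusm_register_provider(const char *name,
                                       const struct dpusm_provider_ops *ops);
    extern void dpusm_unregister_provider(const char *name);

    static int __init myaccel_init(void)
    {
        /* Once registered, storage software such as ZFS (via Z.I.A.)
         * can select this provider by name and call through the
         * uniform ops table, with no vendor-specific code of its own. */
        return dpusm_register_provider("myaccel", &myaccel_ops);
    }

    static void __exit myaccel_exit(void)
    {
        dpusm_unregister_provider("myaccel");
    }

    module_init(myaccel_init);
    module_exit(myaccel_exit);
    MODULE_LICENSE("GPL");

The key design point this sketch illustrates is the indirection: the storage stack depends only on the uniform interface, so swapping one accelerator for another means loading a different provider module, with no changes to the filesystem code.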