SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA
With the growth of containerized applications and Kubernetes as an orchestration layer, the ability to run these technologies directly on the storage device opens new options for implementing parallel data processing. Using an OS-based Computational Storage Drive (CSD), this session presents a deployment of Apache Spark and the steps required to achieve it, showing how a distributed processing job can be orchestrated across the host and the CSDs at the same time to maximize the benefits of the application deployment.
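As a rough illustration of the kind of deployment the session walks through, the sketch below submits a Spark job to a Kubernetes cluster and pins part of the work onto CSD-resident executors via a node label. The master URL, container image, node label, and data path are illustrative assumptions, not values taken from the presentation.

    # Minimal PySpark sketch of a job split between host and CSD executors.
    # The Kubernetes endpoint, image name, and node selector label below are
    # hypothetical; an actual CSD deployment would substitute its own values.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .appName("csd-filter-example")
        # Hypothetical Kubernetes API endpoint used as the Spark master.
        .master("k8s://https://kubernetes.example:6443")
        # Hypothetical container image with a Spark executor built for the CSD's OS.
        .config("spark.kubernetes.container.image", "example/spark-csd:latest")
        # Hypothetical node label that schedules some executors onto the CSDs.
        .config("spark.kubernetes.node.selector.storage/csd", "true")
        .getOrCreate()
    )

    # Executors on the CSDs read and filter data close to where it is stored;
    # the reduced result flows back to host-side executors for final aggregation.
    df = spark.read.parquet("/data/events")
    result = df.filter(df["status"] == "ERROR").groupBy("host").count()
    result.show()

    spark.stop()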
Learn what is happening in NVMe to support Computational Storage devices. The development is ongoing and not finalized, but this presentation will describe the directions the proposal is taking. Kim and Stephen will describe the high-level architecture being defined in NVMe for Computational Storage, which provides for programs based on standardized eBPF. We will describe how this new command set fits within the NVMe I/O Command Set architecture, cover the commands that are necessary for Computational Storage, and discuss a proposed new controller memory model that can be used by computational programs.
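To make the flow concrete, the hypothetical host-side sequence below mirrors the steps the architecture implies: load an eBPF program onto the device, stage input data in device-resident memory, execute the program, and read the results back. The helper calls are placeholders invented for illustration; they are not real NVMe commands or an existing driver API.

    # Hypothetical host-side flow for an NVMe computational program. None of
    # these helpers correspond to actual NVMe opcodes or a real driver API;
    # they only mirror the sequence of steps sketched in the abstract.
    def run_computational_program(dev, ebpf_bytecode, input_lbas):
        slot = dev.load_program(ebpf_bytecode)                 # place the eBPF program in a program slot
        buf = dev.alloc_device_memory(size=4 * 1024 * 1024)    # reserve device-resident memory
        dev.copy_namespace_to_memory(input_lbas, buf)          # stage input data next to the program
        status = dev.execute_program(slot, buf)                # run the program against the staged data
        if status != 0:
            raise RuntimeError(f"program failed with status {status}")
        return dev.read_device_memory(buf)                     # pull the (reduced) result back to the host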
Standardized computational storage services are frequently touted as the Next Big Thing in building faster, cheaper file systems and data services for large-scale data centers. However, many developers, storage architects, and data center managers are still unclear on how best to deploy computational storage services and whether computational storage offers real promise in delivering faster, cheaper, and more efficient storage systems. In this talk we describe Los Alamos National Laboratory's ongoing efforts to deploy computational storage into the HPC data center. We focus first on describing the quantifiable performance benefits offered by computational storage services. Second, we describe the techniques used at Los Alamos to integrate computational storage into ZFS, a fundamental building block for many of the distributed storage services provided for Los Alamos scientists. By developing ZIA, the ZFS Interface for Accelerators, Los Alamos is able to embed data processing elements along the data path and provide hardware acceleration for data-intensive processing tasks currently performed on general-purpose CPUs. Finally, we describe how computational storage is leading to a fundamental re-architecture of HPC platform storage systems, along with the lessons learned and practical limitations when applying computational storage to data center storage systems.
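The sketch below shows, in greatly simplified form, the offload pattern described here: hand a data-intensive stage such as compression to an accelerator sitting on the data path, and fall back to the host CPU when none is present. It is not the ZIA interface; the names and the fallback policy are assumptions made purely for illustration.

    # Simplified offload-with-fallback pattern; not the actual ZIA interface.
    import zlib

    class SoftwareProvider:
        """Host-CPU fallback used when no accelerator is attached."""
        def compress(self, data: bytes) -> bytes:
            return zlib.compress(data)

    def compress_block(block: bytes, accelerator=None) -> bytes:
        # Prefer a data-path accelerator when one is registered; otherwise run
        # the same operation on the general-purpose CPU, as ZFS does today.
        provider = accelerator if accelerator is not None else SoftwareProvider()
        return provider.compress(block)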
There is a new architectural approach to accelerating storage-intensive databases and applications, one that takes advantage of new techniques to dramatically accelerate database performance, improve response times, and reduce infrastructure cost at massive scale. This architecture stores fundamental data structures efficiently, delivering space savings of up to 80% over host- and software-based techniques that cannot address the inherent inefficiencies of today's fastest SSD technology. With novel data structures and algorithms, it is a unique data processing approach that fundamentally simplifies the storage stack.
The SNIA Computational Storage TWG is driving forward with both a CS Architecture specification and a CS API specification. How will these specifications affect the industry's growing Computational Storage efforts? Learn what is happening in industry organizations to make Computational Storage something you can buy from a number of vendors, moving your computation to where your data resides. Hear what is being developed in different organizations to make your data processing faster and to allow scale-out storage solutions to multiply your compute power.
With ever-increasing dataset sizes, several file formats like Parquet, ORC, and Avro have been developed to store data efficiently and to save network and interconnect bandwidth at the price of additional CPU utilization. However, with the advent of networks supporting 25-100 Gb/s and storage devices delivering 1,000,000 reqs/sec, the CPU has become the bottleneck, trying to keep up feeding data in and out of these fast devices. The result is that data access libraries executed on single clients are often CPU-bound and cannot utilize the scale-out benefits of distributed storage systems. One attractive solution to this problem is to offload data-reducing processing and filtering tasks to the storage layer. However, modifying legacy storage systems to support compute offloading is often tedious and requires an extensive understanding of the internals. SkyhookDM introduces a new design paradigm for building computational storage systems by extending existing storage systems with plugins. Our design allows extending programmable object storage systems by embedding existing and widely used data processing frameworks and access libraries into the storage layer with minimal modifications. In this approach, data processing frameworks and access libraries can evolve independently from storage systems while leveraging the scale-out, availability, and failure recovery properties of distributed storage systems. SkyhookDM is a data management system that offloads data processing tasks to the storage layer to reduce client-side CPU, memory, and network traffic for increased scalability and reduced latency. On the storage side, SkyhookDM uses the existing Ceph object class mechanism to embed Apache Arrow libraries in the Ceph OSDs and uses C++ methods to facilitate data processing within the storage nodes. On the client side, the Arrow Dataset API is extended with a new file format that bypasses the Ceph filesystem layer and invokes storage-side Ceph object class methods on the objects that make up a file in the filesystem layer. SkyhookDM currently supports Parquet as its object storage format, but support for other file formats can be added easily due to the use of Arrow access libraries.
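On the client side, a scan through the extended Arrow Dataset API might look like the sketch below. The SkyhookFileFormat name and its constructor arguments are assumptions drawn from the description above rather than a confirmed interface; the point is only that the application-facing Dataset calls stay the same whether filtering happens on the client or inside the Ceph OSDs.

    # Sketch of a client-side scan with the Arrow Dataset API. The
    # SkyhookFileFormat name and its constructor arguments are illustrative
    # assumptions based on the abstract, not a confirmed interface.
    import pyarrow.dataset as ds

    # A plain Parquet scan decodes and filters entirely on the client CPU.
    local = ds.dataset("cephfs/mount/events", format="parquet")

    # A Skyhook-style format would instead route the same scan to Ceph object
    # class methods on the OSDs, so filtering and projection happen storage-side.
    skyhook_format = ds.SkyhookFileFormat("parquet", "/etc/ceph/ceph.conf")  # hypothetical
    offloaded = ds.dataset("cephfs/mount/events", format=skyhook_format)

    # Either way, the Dataset API is identical from the application's view.
    table = offloaded.to_table(
        columns=["host", "status"],
        filter=ds.field("status") == "ERROR",
    )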
The computational storage ecosystem is growing fast, from IP providers to system integrators. How do you benefit from this promising technology: design your own device and add your value at the device level, or save time and select an available product? This talk will present guidelines for using computational storage technology and a review of the computational storage building blocks, including compute, non-volatile memories, interfaces, embedded software, host software, and tools.