Ozone - Architecture and Performance at billions’ scale

Object stores are known for ease of use and massive scalability. Unlike other storage solutions such as file systems and block stores, object stores can handle data growth without an increase in complexity or developer intervention. Apache Hadoop Ozone is a highly scalable object store that extends the design principles of HDFS while operating at 10-100x the scale of HDFS. It can store billions of keys and hundreds of petabytes of data. At this scale, it must sustain very high throughput while maintaining low latency.

Apache Ozone - Balancing and Deleting Data At Scale

Apache Ozone is an object store that scales to tens of billions of objects, hundreds of petabytes of data, and thousands of datanodes. Ozone supports not only high-throughput data ingestion but also high-throughput deletion, with performance similar to HDFS. Furthermore, at massive scale, data can become non-uniformly distributed due to the addition of new datanodes, deletion of data, and so on. Non-uniform distribution can lead to lower utilisation of resources and can affect the overall throughput of the cluster.
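
To make the balancing problem concrete, here is a minimal C sketch of the underlying idea. It is not Ozone's actual balancer code: the datanode figures and the 10% threshold are invented, and a real balancer considers many more factors.

    /* Illustrative only: compare each datanode's utilisation against the
     * cluster average and flag nodes that deviate by more than a threshold. */
    #include <stdio.h>

    int main(void) {
        /* hypothetical per-datanode used space and capacity, in TB */
        double used[]     = {  80, 120,  30,  95 };
        double capacity[] = { 100, 150, 100, 100 };
        int n = 4;
        double threshold = 0.10;   /* 10% allowed deviation, chosen arbitrarily */

        double total_used = 0, total_cap = 0;
        for (int i = 0; i < n; i++) { total_used += used[i]; total_cap += capacity[i]; }
        double avg = total_used / total_cap;   /* average cluster utilisation */

        for (int i = 0; i < n; i++) {
            double u = used[i] / capacity[i];
            const char *state = "within threshold";
            if (u > avg + threshold)      state = "over-utilised (candidate source)";
            else if (u < avg - threshold) state = "under-utilised (candidate target)";
            printf("datanode %d: %3.0f%% used, %s\n", i, 100 * u, state);
        }
        return 0;
    }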

Storing Data over Millennia: Long-Term Room-Temperature Storage of DNA

The most expensive factor in traditional archival storage is that it is not durable; over the years, many migrations are needed due to media degradation and technology obsolescence. Because the format of the DNA molecule is immutable, DNA reading technology will not become obsolete, mitigating this problem. However, low-cost DNA storage does come with some imperatives. Indeed, DNA outside the cell, like any biological molecule, is subject to aggressive degradation factors, the main one being water.

Fighting Ransomware Using Intelligent Storage

Ransomware is an acknowledged threat, and protecting your data must be a security-in-depth exercise. We discuss how Intelligent Storage can detect and recover from an attack while maintaining administrative isolation from compromised servers. While this method is only a single layer of a defence-in-depth infrastructure, it can be implemented invisibly on existing workloads and on storage that can gather the proper sets of metrics.
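
As one hedged illustration of the kind of metric such a storage layer might track (this is not the talk's actual detection logic, and the 7.5 bits-per-byte threshold is purely illustrative), the C sketch below flags write buffers whose Shannon entropy approaches that of encrypted or compressed data:

    /* Illustrative only: high, near-uniform byte entropy in freshly written
     * blocks is one signal that plaintext may have been replaced by ciphertext. */
    #include <math.h>
    #include <stddef.h>
    #include <stdio.h>

    static double shannon_entropy(const unsigned char *buf, size_t len) {
        size_t counts[256] = {0};
        for (size_t i = 0; i < len; i++) counts[buf[i]]++;
        double h = 0.0;
        for (int b = 0; b < 256; b++) {
            if (counts[b] == 0) continue;
            double p = (double)counts[b] / (double)len;
            h -= p * log2(p);
        }
        return h;                                /* 0..8 bits per byte */
    }

    static int looks_encrypted(const unsigned char *buf, size_t len) {
        return shannon_entropy(buf, len) > 7.5;  /* illustrative threshold */
    }

    int main(void) {
        unsigned char block[4096];
        for (size_t i = 0; i < sizeof(block); i++)
            block[i] = "example plaintext "[i % 18];
        printf("plaintext-like block flagged: %d\n",
               looks_encrypted(block, sizeof(block)));
        return 0;                                /* compile with -lm */
    }

High entropy alone is not proof of an attack; a real detector would correlate it with other workload metrics.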

From DNA Synthesis on Chips to DNA Data Storage

Enabling data storage on DNA relies on advancements in semiconductor technology to make DNA synthesis cheaper, a must-have for this field to emerge. The talk will introduce storage practitioners to how semiconductors are used to create DNA, how the two are tied together, and how advancements in semiconductors are crucial to bringing DNA data storage costs down.

Designing with Privacy in Mind

Business requirements are not the only influences on our technical solutions. Laws and regulations transform the technical landscape in ways that require us to redefine our architecture as well as our skill set. This is especially true with data privacy. Since GDPR and CCPA, our industry has witnessed a new career path emerge: the Privacy Engineer. Privacy engineering is starting today where security started ten years ago. Join us as we look at Privacy by Design (PbD) and introduce some architecture patterns that align with privacy strategies.

Privacy's Increasing Role in Technology

Every organization today is in some state of digital transformation. While the understanding of security needs in the digital age has matured significantly over the last two decades, the implications for data privacy, and in particular its interaction with technology solutions, are still not well understood. As data regulations and laws continue to evolve globally, organizations require an increased understanding of privacy requirements and their impact on technology solutions.

Samba Multi-Channel/io_uring Status Update

Samba has had experimental support for multi-channel for quite a while. SMB3 defines a few concepts for replaying requests safely, and we now implement them completely (and in parts better than a Windows server). The talk will explain how we implemented the missing features. With ever-increasing network throughput, we will reach a point where data copies are too much for a single CPU core to handle. This talk gives an overview of how the io_uring infrastructure of the Linux kernel could be used to avoid copying data and to spread the load across CPU cores.
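
For readers unfamiliar with io_uring, the minimal C sketch below (not Samba code; it assumes liburing is installed and that a file named data.bin exists) shows the basic submission/completion cycle in which I/O requests are queued and submitted in batches rather than issued as individual blocking syscalls:

    /* Minimal liburing example: queue one read, submit, wait for completion. */
    #include <fcntl.h>
    #include <liburing.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void) {
        struct io_uring ring;
        if (io_uring_queue_init(8, &ring, 0) < 0) { perror("queue_init"); return 1; }

        int fd = open("data.bin", O_RDONLY);              /* hypothetical input file */
        if (fd < 0) { perror("open"); return 1; }

        char buf[4096];
        struct io_uring_sqe *sqe = io_uring_get_sqe(&ring);
        io_uring_prep_read(sqe, fd, buf, sizeof(buf), 0); /* read at offset 0 */
        io_uring_submit(&ring);                           /* one syscall submits the batch */

        struct io_uring_cqe *cqe;
        io_uring_wait_cqe(&ring, &cqe);                   /* wait for the completion event */
        printf("read %d bytes\n", cqe->res);
        io_uring_cqe_seen(&ring, cqe);

        io_uring_queue_exit(&ring);
        close(fd);
        return 0;
    }

The copy-avoidance and load-spreading ideas mentioned in the abstract build on this same submission/completion model.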

Accelerating File Systems and Data Services with Computational Storage

Standardized computational storage services are frequently touted as the Next Big Thing in building faster, cheaper file systems and data services for large-scale data centers. However, many developers, storage architects, and data center managers are still unclear on how best to deploy computational storage services and whether computational storage offers real promise in delivering faster, cheaper, and more efficient storage systems. In this talk, we describe Los Alamos National Laboratory's ongoing efforts to deploy computational storage in the HPC data center.

Sanitization – Forensic-proofing Your Data Deletion

Almost everyone understands that systems and data both have lifecycles that typically include a disposal phase (i.e., what you do when you no longer need something). Conceptually, data needs to be eliminated as part of this disposal, either on a single system or entirely (everywhere it is stored). Simply hitting the delete key may seem like the right approach, but the reality is that eliminating data can be difficult. Additionally, failing to correctly eliminate certain data can result in costly data breach scenarios.
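
To make the gap concrete, the C sketch below contrasts merely unlinking a file with overwriting its contents first. It is illustrative only: the filename is hypothetical, and overwriting at the file level does not address SSD wear-levelling, snapshots, backups, or copies held elsewhere, which is exactly why proper sanitization is harder than it looks.

    /* Illustrative only: unlink() removes the name, not necessarily the data;
     * overwriting before unlinking is closer to a "clear" operation. */
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int clear_and_unlink(const char *path) {
        struct stat st;
        if (stat(path, &st) != 0) return -1;

        int fd = open(path, O_WRONLY);
        if (fd < 0) return -1;

        char zeros[4096] = {0};
        off_t remaining = st.st_size;
        while (remaining > 0) {
            size_t chunk = remaining > 4096 ? 4096 : (size_t)remaining;
            ssize_t n = write(fd, zeros, chunk);
            if (n <= 0) { close(fd); return -1; }
            remaining -= n;
        }
        fsync(fd);            /* push the overwrite out to the device */
        close(fd);
        return unlink(path);  /* only now remove the directory entry */
    }

    int main(void) { return clear_and_unlink("obsolete.dat") == 0 ? 0 : 1; }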
