SDC EMEA 2021 Abstracts


Plenary Abstracts

Unlocking the potential of NVMe-oF and Software Defined Storage thanks to programmable DPUs

Sebastien Le Duc


During the last two decades, the data center world has been moving to a “Software Defined Everything” paradigm. Until recently, this has been handled mostly by hypervisors running on x86 processors.

In parallel, a new communication protocol for interfacing with SSDs has been specified from the ground up to fully exploit the parallelism and performance of all-flash storage: NVMe, along with NVMe-oF. NVMe-oF promises the performance of direct-attached all-flash storage with the flexibility and TCO savings of shared storage. To fully unlock the benefits of NVMe-oF while keeping the software-defined paradigm, we believe a new kind of processor is needed: the Data Processing Unit, or DPU.

Evolution of Load-Store I/O to meet the needs of Compute and Storage Infrastructure

Dr Debendra Das Sharma


Emerging and existing applications with cloud computing, 5G, IoT, automotive, and high-performance computing are causing an explosion of data. This data needs to be processed, moved, and stored in a secure, reliable, available, cost-effective, and power-efficient manner. Heterogeneous processing, tiered memory and storage architecture, infrastructure accelerators, and infrastructure processing units are essential to meet the demands of this evolving compute and storage landscape. These requirements are driving significant innovations across compute, memory, storage, and interconnect technologies. Compute Express Link* (CXL) with its memory and coherency semantics on top of PCI Express* (PCIe) is paving the way for the convergence of memory and storage with near memory compute capability. Pooling of compute, memory, and storage resources with CXL interconnect will lead to rack-scale efficiency with efficient low-latency access mechanisms across multiple nodes in a rack with advanced atomics, acceleration, smart NICs, and persistent memory support. In this talk we will explore how the evolution in load-store interconnects will benefit the compute and storage infrastructure of the future.  

Computational Storage: from edge to cloud, which technology for which use case?

Jerome Gaysse


When designing a new system from the ground up, there are multiple ways to implement Computational Storage (CS) in the architecture, depending on the form factor, the computing type (CPU or FPGA), and the environment (edge or cloud).
This talk will review the available CS technologies and explain how to evaluate which CS technology provides the best benefits for specific workloads, including storage services, databases, and AI. System architecture examples will be described with use case simulation results.

Track 1 Abstracts

SMB3 over QUIC: Deep Dive and Document Updates

Obaid Farooqi 


This talk will give an overview of what’s been added to the SMB3 protocol documentation for the latest release, as well as a deep dive into using SMB3 over the QUIC transport. The SMB3 protocol is broadly deployed in enterprise networks and contains strong protection to enable its use more broadly. However, port 445 has historically been blocked, and management of servers over TCP has been slow to emerge. SMB3 is now able to communicate over QUIC, a new internet standard transport that is being broadly adopted for web and other application access. In this talk, we will provide updated details on the SMB3 over QUIC protocol and explore the necessary ecosystem, such as certificate provisioning, firewall and traffic management, and enhancements to SMB server and client configuration.

Learning Objectives: You will learn the details about SMB3 on QUIC; how it works, the architecture and performance comparison with SMB on TCP. Documentation updates related to SMB3 on QUIC will also be presented.

Finding the right data in an ocean of bits

Glyn Bowden


Data science and machine learning are hugely popular topics at the moment, but those topics don’t do justice to the actual work involved. Most of the time is not spent on tweaking the features of the model or pruning it to optimise for edge deployment. It's spent finding, assessing, and then wrangling the data into something meaningful and useful that the algorithm can get to work on. In this talk we will explore some of the challenges facing the data hunters and look at some methods that can help speed the process along and allow for repeatable data operations.



The Role of Centralized Storage in the Emerging Interactive Cockpit

Or Lapid


In this session we discuss the emerging technologies and architectures in the future car cockpit environment and present the use cases and features of centralized NVMe storage in that architecture.
  1. We will explain the benefits of automotive-grade NVMe SSDs compared with other storage solutions, such as UFS, which is used in current and past IVI (In-Vehicle Infotainment) systems.
  2. We will discuss the use of NVMe namespaces, SSD security, and the benefits of centralized storage as crucial elements for sharing information across the various functions of the vehicle while maintaining security.

Development of software-defined storage engine for 25 million IOPS

Dmitrii Smirnov


Modern CPU and NVMe technologies have made a breakthrough and have opened up new horizons for Software-Defined Storage performance. However, not everyone is aware of the new SDS performance potential, and some consider it unreachable, so current software solutions are not ready for the new hardware capabilities. We want to share our knowledge about developing an SDS engine that achieves 25 million IOPS per storage node. We will describe the challenges we met and resolved while developing a universal engine for the Linux kernel and for SPDK. We will show performance optimizations and CPU-friendly hints, and discuss how to avoid mistakes when developing all-flash storage, how not to waste millions of IOPS, and how to reach new hardware limits.

Distributed WorkLoad Generator for Load Testing Using Emerging Technologies

Vishnu Murty


In the DellEMC Enterprise Server/Storage Validation Organization, we perform load testing on servers using different workloads (Web, File, FTP, Database, Mail, etc.) to measure how the systems perform under heavy load. Knowing how DellEMC Enterprise Systems behave under heavy load (% CPU, % Memory, % Network, % Disk) is extremely valuable and critical. This is achieved with the help of load testing tools. The load testing tools available on the market come with their own challenges, such as cost, learning curve, and workload support. In this talk we will demonstrate how we built JAAS (JMeter As A Service), a distributed workload testing solution based on containers and open-source tools, and how this solution plays a crucial role in our server validation efforts.

Object Storage for Cloud Native Applications

Nicolas Trangez & Vianney Rancurel


We are now seeing the emergence of a new generation of application workloads in big data analytics, AI/ML, and in-memory databases, as well as on the edge, that are creating new demands on storage. These applications are being deployed and managed on Kubernetes, with a need for fast and agile access to storage. Automated provisioning through CSI, the Container Storage Interface for block and file volumes, is already being embraced. In this session, we will discuss the fit of object storage for unstructured data generated by these applications, and the emergence of object storage provisioning operators such as COSI, the Container Object Storage Interface from CNCF and Kubernetes’ SIG-Storage, as a way of simplifying and automating object storage provisioning.

Evaluating Cache performance using cloud storage traces

Effi Ofer


Designing a cache front end for cloud storage calls into question the effectiveness of the popular LRU cache eviction policy versus the FIFO heuristic. Several past works have considered this question and commonly concluded that while FIFO is much easier to implement, the improved hit ratio of LRU outweighs this. Two main trends call for a re-evaluation: the very large scale of cloud storage, which makes managing cache metadata in RAM infeasible, and new workloads that possess different characteristics. We model the overall cost of running LRU and FIFO in a very large-scale cache and evaluate this cost using traces taken from a real-world object store system. While there are many traces for file and block storage, to date there has been a dearth of traces for object storage. To address this gap, we collected object storage traces from the IBM Public Cloud. IBM has anonymized these traces and made them available to the public at the SNIA IOTTA repository.
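
The difference between the two eviction policies can be illustrated with a minimal sketch. It is not the cost model from the talk: it only counts hits on a tiny made-up trace, and the function names and the trace are assumptions chosen for illustration. Note that LRU must update its ordering metadata on every hit, while FIFO touches metadata only on a miss, which is the implementation cost the abstract refers to.

```python
from collections import OrderedDict, deque


def lru_hit_ratio(trace, capacity):
    """Hit ratio of an LRU cache; ordering metadata is updated on every hit."""
    if not trace:
        return 0.0
    cache = OrderedDict()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # evict least recently used
            cache[key] = True
    return hits / len(trace)


def fifo_hit_ratio(trace, capacity):
    """Hit ratio of a FIFO cache; a hit changes no metadata at all."""
    if not trace:
        return 0.0
    cache = set()
    order = deque()
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
        else:
            if len(cache) >= capacity:
                cache.discard(order.popleft())  # evict oldest insertion
            cache.add(key)
            order.append(key)
    return hits / len(trace)


# Hypothetical trace: object "a" is popular, the others are touched rarely.
trace = ["a", "b", "a", "c", "a", "d", "a", "b"]
print(lru_hit_ratio(trace, 2), fifo_hit_ratio(trace, 2))
```

On this trace LRU keeps the popular object cached while FIFO evicts it once; real traces, such as the ones published in the SNIA IOTTA repository, may behave very differently.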

IO abort recovery using the IO Cancel command with NVMe over Fabrics

John Meneghini


This presentation will discuss the work done to improve the timeout-recovery-abort protocol in NVMe. Technical Proposal 4097 made changes to the NVMe Abort Command and added a new IO Cancel Command to the NVMe 1.4 and 2.0 specifications. Support for these changes is being implemented in Data ONTAP, Linux, ESX, and SPDK. This presentation will include a short demonstration of a working IO Cancel command implementation using a Linux host with a Data ONTAP NVMe/FC controller.

Ceph RGW Message Queue API for Serverless Computing

Yuval Lifshitz & Dr Huamin Chen


A proposal to support AWS SQS API natively in Ceph RGW, for advanced Serverless computing use cases.

Bucket notification has become widespread in many applications including AI/ML and Edge Computing. We have demonstrated that Ceph Rados Gateway (RGW) bucket notifications trigger Serverless functions including Knative and AWS Lambda. Such workflows can be characterized as event push and event driven.

We’ve identified yet another workflow that allows Serverless computing frameworks to be more preemptive in auto scaling. The workflow is based on the event queue polling model, the opposite of event push, which allows Serverless computing frameworks, such as KEDA, to preemptively scale the functions based on the queue length. Such a workflow is more advantageous and lightweight when dealing with highly bursty event traffic and when reducing Serverless function cold start overhead.

In order to support this workflow, we propose a new set of message queue APIs, modeled after AWS SQS, for Ceph RGW. In this talk, we’ll present the overview and planning of this technology to the community.
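
As a rough illustration of scaling on queue length, the sketch below computes a replica count from the number of queued messages. It is a simplified assumption about how a polling autoscaler might behave, not KEDA's actual implementation and not the proposed RGW API; the function name, parameters, and default limits are hypothetical.

```python
import math


def desired_replicas(queue_length, target_per_replica,
                     min_replicas=0, max_replicas=10):
    """Return how many function replicas to run for a polled queue length.

    Each replica is assumed to handle `target_per_replica` queued messages;
    the result is clamped to the [min_replicas, max_replicas] range.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    wanted = math.ceil(queue_length / target_per_replica)
    return max(min_replicas, min(max_replicas, wanted))


# A burst of 12 queued events with a target of 5 per replica asks for 3 replicas.
print(desired_replicas(12, 5))
```

Because the scaler sees the backlog before any event is delivered, it can start replicas ahead of the burst rather than reacting to each pushed event.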

MASSé: A high-performance storage solution for 3D XPOINT™ and Flash SSDs

Jack Zhang


MASSé is the Media Aware Smart Storage Engine. While Solid-State-Drives (SSDs) have become commonplace in data-center and cloud, innovative new storage media technologies such as 3D XPOINT™ media or Intel® Optane™ media and Quad-level Cell (QLC) flash have emerged to address the ever-increasing need for better performance and lower total cost of ownership (TCO). Our observation is that existing storage algorithms and data structures - from application, through kernel, and to filesystem - have not evolved to take full advantage of the high-impact capabilities of these new storage media technologies. Consequently, applications are not fully benefitting from the performance capabilities of 3D XPOINT™ media, and the migration from today’s mainstream TLC (Triple-level Cell) SSDs to low-cost QLC SSDs has been slowed. MASSé includes optimizations designed to take full advantage of the latest Solid-State-Drive media innovations, allowing applications to achieve better performance with 3D XPOINT™ media, as well as speed the migration to more cost-effective QLC SSDs. MASSé is lightweight open-source software that resides in user space, independent from applications and the kernel system. It can be integrated as a plug-in module into any storage system, enabling a tiered storage architecture that is more capable of delivering on the performance capabilities of 3D XPOINT™ media and the low-cost advantages of QLC SSDs.

Challenges of building an Application-Aware Regional Disaster Recovery Solution for Kubernetes

Veera Deenadhayalan & Shyam Ranganathan


Kubernetes does not offer a native disaster recovery solution that can handle a disaster affecting an entire region. While Kubernetes can be deployed as a stretch cluster to tolerate the failure of a data center or an availability zone, it is typically not stretched across regional boundaries due to the high latency between regions. This talk outlines the challenges of building an application-aware regional disaster recovery solution on the Kubernetes platform.

Fully Autonomous Storage and Memory Hierarchies

Irfan Ahmad


Today, storage and memory hierarchies are manually tuned and sized at design time. But tomorrow’s workloads are increasingly dynamic, multi-tenant and variable. Can we build autonomous storage systems that can adapt to changing application workloads?
In this session, we demonstrate how breakthroughs in autonomous storage systems research can deliver impressive gains in cost, performance, latency control and customer out-of-the-box experience. 

Status of changes from SNIA and intro to NVMe work

Eli Tiomkin


There has been a lot of movement in the computational storage market since the SDC events of 2020. With the addition of many new companies and viewpoints, we have been able to refine and redefine some parts of the architectural work being done in the Computational Storage Technical Working Group (CS TWG). This presentation will provide viewers with an update on the technology terms, the reasons for the changes, and how they improve the overall position of the technology in the market. We will leave the viewers with a preview of the latest version of the architectural document, along with links and materials that show the direction and growth of Computational Storage in the market.

Track 2 Abstracts

High-speed optimized SPDK implementation on a 80 cores manycore processor.

Jean-Francois Marie & Remy Gauguey


As you know, the Storage Performance Development Kit (SPDK) provides a set of tools and libraries for writing high-performance, scalable, user-mode storage applications. Kalray’s MPPA® manycore architecture offers a unique 80-core system.
A manycore processor is characterized by an apparent grouping, from a software point of view, of cores and their portion of the memory hierarchy into computing units. This grouping can delimit the scope of cache consistency and inter-core synchronization operations, and can include explicitly addressed local working memories (as opposed to caches) or even specific data movement engines and other accelerators. Computing units interact with one another and access external memories and processor I/O through a communication device that can take the form of a network-on-chip (NoC). The advantage of the manycore architecture is that a processor can scale to massive parallelism by replicating the computing units and extending the network-on-chip, whereas for a multi-core processor the replication applies at the core level.
For storage purposes, the internal processor clusters are configured with one dedicated cluster as a control and management plane, and the remaining four clusters as four independent data planes. We have implemented SPDK so that it provides a unique scalable platform that can deliver high performance on an 80-core system.
This presentation will explain how we ported SPDK to our processor and what unique pieces of technology have been developed in order to coordinate with the processor internals. We will also explain how the platform can scale.

Emerging 5G use cases impact on edge and Core Storage

Umang Kumar


The broad adoption of 5G and its emerging use cases will reshape the nature and role of enterprise and cloud/edge storage over the next several years. The increased use of video data and the improved resolution of image sensors will have a significant impact on storage capacity. So too will the underlying virtualization of the infrastructure and the move towards cloud-native architectures. This presentation will try to identify the emerging 5G use cases and how they will impact edge and core storage. Some example use cases are:
Video surveillance
Connected car and autonomous vehicles
Cloud based gaming 

The presentation will also explore the best approaches for getting data from the Edge to the Cloud.

Long Term Retention for Medical AI Applications

Simona Rabinovici-Cohen


SNIA Self-contained Information Retention Format (SIRF) is an ISO/IEC 23681:2019 standard that defines a storage container for long term retention. SIRF enables future applications to interpret stored data regardless of the application that originally produced it. SIRF can be beneficial in domains that need to keep data for long periods and enable search, access, and analytics on that data in the far future. The standard also includes examples of SIRF serialization on the cloud and on tapes.

Artificial Intelligence (AI) in medical applications is an emerging area that aims to analyze medical big data to improve screening, diagnosis and treatment of disease. One challenge of this area is providing storage systems to efficiently store and preserve the data for future analysis within the patient lifetime and beyond. The European Union H2020 BigMedilytics project includes a pilot to analyze clinical and imaging data for early prediction of breast cancer treatment outcomes. In this talk, we’ll describe SIRF and how it can be applied in the BigMedilytics breast cancer pilot.

A Brief Introduction to Build an Efficient Blockchain

Dr Sweta Kumari & Dr Archit Somani


Blockchains like Bitcoin and Ethereum have become very popular. Due to their usefulness, they are now being considered for many decentralized applications, such as automating and securely storing user records for land sales, vehicles, and insurance. But developing an efficient blockchain remains largely uncharted territory. In this session I will present ideas on how to build an efficient blockchain.

Open Data Framework for Hybrid Data Management

Sanil Kumar D


Data and storage management across Edge, On-Premise, and Cloud is becoming a key requirement for any use case these days. This session will provide a brief introduction to the Open Data Framework from the SODA Foundation, a completely open-source solution, and show how it helps address the key challenges of hybrid data management, cloud-native storage, and heterogeneous storage management. It will present the overall architecture and the key projects for different data management use cases.

Ozone, the story of a billion keys

Lokesh Jain


Object stores are known for ease of use and massive scalability. Unlike other storage solutions such as file systems and block stores, object stores are capable of handling data growth without an increase in complexity or developer intervention. Apache Hadoop Ozone is a highly scalable object store and a spiritual successor of HDFS. It can store billions of keys and hundreds of petabytes of data. At this massive scale, it must deliver very high throughput while maintaining low latency.

This talk discusses the Ozone architecture and the design decisions that were significant for achieving high throughput and low latency. To handle petabytes of data and billions of keys, Ozone has a scalable metadata layer. The talk will detail how Ozone supports this layer without compromising throughput or latency. Such a massive scale also requires Ozone to be scalable in terms of client connections and the amount of data read from and written to the store. The talk will discuss the challenges faced and the corresponding design solutions. It will also touch upon Ozone’s goal of reaching a trillion objects and the possible challenges.

Micron demonstration of AI, ML and Data Science I/O acceleration using NVIDIA GPU Direct Storage technology

Or Lapid


Here at Micron, we've been analyzing the effect NVIDIA GPU Direct Storage (GDS) can have on storage I/O in GPU-enabled systems. In this session we discuss why this technology is important, the performance improvements from using GDS, and an exciting new architecture for datacenter NVMe-Ethernet-attached Bunch Of Flash (EBOF) for NVMe Over Fabrics (NVMe-oF).


Introducing Fabric Notifications, from awareness to action

Howard Johnson & AJ Casamento


Marginal links and congestion have plagued storage fabrics for years and many independent solutions have been tried. The Fibre Channel industry has been keenly aware of this issue and, over the course of the last two years, has created the architectural foundation for a common ecosystem solution. Fabric Notifications employs a simple message system to provide registered participants with information about key events in the fabric that are used to automatically address link integrity and congestion issues. This new technology has been embraced by the Fibre Channel community and has demonstrated a significant improvement in addressing the nagging issues. In this informative session, storage experts will discuss the evolution of this technology and how it is a step toward a truly autonomous SAN.

2-Dimensional Erasure Coded LTO Tapes with RAIL Library configuration

Turguy Goker


Active archives enable media companies to quickly, efficiently and cost-effectively search their vast volumes of content to retrieve and process the assets they need. Erasure coding can be used to protect this content, but it typically requires more resources and increases content retrieval latencies, limiting performance. This paper proposes a technology that offers higher content durability and availability with minimum latencies and less storage overhead than a two-copy scheme. The INSIC Tape Roadmap forecasts 35% Areal Density and 25% Track Density CAGR, requiring future track pitches to be below 1000nm, such as a recently published 580TB tape technology demo requiring 50nm track pitch. At these dimensions, considering that tape must operate in open environmental conditions, drive and media failures will be dominated by correlated errors requiring more protection than an internal format erasure correcting code (ECC) and two-copy scheme can provide to achieve more than 11-nines durability.

Current replication techniques don’t provide resource-friendly protection and can’t meet 11-nines durability unless more than two copies are used. In a typical tape archival scenario, a carefully designed multi-dimensional erasure coding mechanism can bring more efficient resource utilization, better availability and durability, and more environment-friendly operation for distributed tape systems with less overhead than two-copy protection. To reduce repeated read cycles and I/O bandwidth requirements, increase durability, and improve local recoverability, we propose multi-dimensional, hierarchical overlay erasure coding with interleaved codewords. Its unique mathematical coding structure and special interleaving allow local repair and make errors almost uncorrelated, helping maximize local tape operations - i.e. minimize the system’s tape, drive and robot requirements.

Thanks to special metadata handling, a byproduct of the proposed system is self-describing erasure-coded tapes that are well suited to disaster recovery and content transportation. These combined contributions help increase the lifetime of different subsystems of tape and improve the archival storage system’s performance.
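
The idea of local repair in a two-dimensional code can be sketched with plain XOR parity over a grid of blocks. This is a deliberately simplified stand-in, not the coding scheme proposed in the talk, which is not described in enough detail here to reproduce; the grid layout and all function names are assumptions for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode_grid(grid):
    """Compute one parity block per row and one per column of a block grid."""
    row_parity = [xor_blocks(row) for row in grid]
    col_parity = [xor_blocks(col) for col in zip(*grid)]
    return row_parity, col_parity


def repair_from_row(grid, row_parity, r, c):
    """Rebuild block (r, c) using only the other blocks in row r (local repair)."""
    survivors = [blk for j, blk in enumerate(grid[r]) if j != c]
    return xor_blocks(survivors + [row_parity[r]])


def repair_from_column(grid, col_parity, r, c):
    """Rebuild block (r, c) from its column when its row is also unavailable."""
    survivors = [grid[i][c] for i in range(len(grid)) if i != r]
    return xor_blocks(survivors + [col_parity[c]])


# Hypothetical 2x2 grid of two-byte blocks.
grid = [[b"ab", b"cd"], [b"ef", b"gh"]]
row_parity, col_parity = encode_grid(grid)
print(repair_from_row(grid, row_parity, 0, 1))
print(repair_from_column(grid, col_parity, 1, 0))
```

In this toy layout a single lost block is rebuilt from one row only, and the column parity serves as a second, independent repair path when the row itself is damaged.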

The vast spread of Computational Storage use cases

Eli Tiomkin


The vast spread of computational storage use cases continues to grow. NGD Systems, a leader in computational storage drives, will go over the journey of computational storage use cases, starting from the obvious one, the cloud, and heading into new and expanding territories, from edge compute and small low-Earth-orbit satellites to cryptocurrency plotting. All share one common thread: how computational storage can increase efficiency by reducing data movement.


EDSFF for Storage, Memory and Acceleration

Pekon Gupta


The challenges facing system and motherboard designers are space, performance, and cost. Increasing the data rate of electrical interfaces on the PCB increases manufacturing cost and adds other engineering challenges. Dedicated pins on CPU sockets for DDR-like interfaces are limited.
The EDSFF specifications SFF-TA-1002 and SFF-TA-1009 standardize the electrical and mechanical outline of devices. This ensures that the same server chassis can support a combination of profiles from different vendors. Individual form-factor specifications such as SFF-TA-1006 and SFF-TA-1008 define power, thermal budget, and airflow conditions, enforcing commonality between modules from various vendors.

CXL™ 2.0: A High-Speed Interconnect for Persistent Memory Challenges

Andy Rudoff


Compute Express Link™ (CXL™) is a high-speed CPU-to-Device and CPU-to-Memory interconnect designed to accelerate next-generation data center performance. CXL is an open industry-standard interconnect offering coherency and memory semantics using high-bandwidth, low-latency connectivity between host processor and devices such as accelerators, memory buffers, and smart I/O devices. CXL technology is designed to address the growing needs of high-performance computational workloads by supporting heterogeneous processing and memory systems for applications in Artificial Intelligence, Machine Learning, communication systems, and high-performance computing. 
Released in November 2020, the CXL 2.0 Specification adds new features – including support for switching, memory pooling for increased memory utilization efficiency, and persistent memory – all while maintaining full backwards compatibility with CXL 1.1 and 1.0. CXL 2.0 provides standardized management of the persistent memory interface and enables simultaneous operation alongside DDR, freeing up DDR for other uses. In this presentation, attendees will learn how CXL 2.0 supports persistent memory, CXL use cases, and what’s ahead for the Consortium.




Microsoft File Server Protocol Test Suites Overview and Updates

Mengyan (Helen) Lu and Obaro Ogbo


In this talk, we’ll cover the latest updates of the Microsoft Protocol Test Suites for File Services. Microsoft Protocol Test Suites are a group of tools that were originally developed for in-house testing of the Microsoft Open Specifications. Microsoft Protocol Test Suites have been used extensively during Plugfests and Interoperability (IO) Labs to test against partner implementations.

Learning Objectives: How do the Test Suites work and what scenarios do they cover? How have the Test Suites changed over the past year? What new testing scenarios are covered for File Services?

Improvements to Accessing Network Files from Linux: SMB3 client progress

Steve French


The Linux SMB3 client has continued to be the most active network/cluster filesystem on Linux over the past year, and progress on the Samba server and the Linux kernel server has helped add new features to the SMB3.1.1 client in Linux and made these improvements even more important.

It has been a great year for SMB, with the addition of many security improvements, many performance improvements, including to caching and RDMA (smbdirect), as well as dramatic improvements to multichannel and new sparse file features. Support has been added for the Witness protocol (allowing transparent movement to a different server) as well as for the new, more feature-rich Linux mount API. In addition, support for the final piece of the optional SMB 3.1.1 POSIX protocol extensions was completed. Tooling has been improved with many new features added to tools like smbinfo, and support for easily getting and setting more auditing and security information.

This presentation will go through some of the new features added to the Linux client over the past year, and demonstrate the great progress in accessing various types of network storage, including the cloud (e.g. Azure), Samba and Windows servers, and the new Linux kernel server (ksmbd).

ksmbd(cifsd) status update

Namjae Jeon


ksmbd (cifsd) is a new SMB3 kernel server that implements the server side of the SMB3 protocol. Its targets are optimized performance, a GPLv2 SMB server, and better lease handling (distributed caching). The bigger goal is to add new features more rapidly, since they are easier to develop on a smaller, more tightly optimized kernel server. Many changes and improvements have been made since cifsd (ksmbd) was introduced at SDC 2019. This talk will give an overview of ksmbd and an update on its current status.

Access control and ID mapping on the Linux SMB Client

Shyam Prasad


The SMB protocol was designed long after Unix was created, and as a result supported concepts like globally unique identities and rich ACLs that are in Windows, but not in Linux. User identity and access control are very relevant to the Linux SMB3 client, as it acts as a bridge between the world of Windows-like-filesystems (including the cloud) and the world of Linux filesystems, and has the hard task of translating security information from the more complex Samba and Windows world, to the simpler Linux/POSIX model.

There are three key problems:
ID mapping: Who is the user, and how does that identity map to the user that the server understands?
Authentication: Can the user prove his/her identity?
Access control: What permissions does the user have for this file?

This talk will discuss and demonstrate the different ways that the Linux client can be configured to map POSIX permissions (mode bits) to ACLs, and the implications of using these configurations. It will discuss the different authentication choices, especially how to leverage Samba’s winbind for easy to use and highly secure Kerberos authentication and key refresh. In addition it will discuss how to integrate with Samba’s winbind to map user identities (from the local Linux client’s UIDs to globally unique SIDs) and the various alternatives like “idsfromsid”. Recent improvements in cifs-utils for managing ACLs and auditing information remotely will also be discussed, which can make managing Samba server easier in some cases.
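
To make the mode-bits side of this mapping concrete, the sketch below expands a POSIX mode into three permission entries, one each for the owner, the group, and everyone else. It is only an illustration of the information the mode bits carry, not the encoding the Linux client uses: a real mapping works with SIDs and Windows access masks, and the function name and entry labels here are hypothetical.

```python
import stat


def mode_to_entries(mode):
    """Expand POSIX mode bits into owner/group/everyone permission strings."""
    def perms(read_bit, write_bit, exec_bit):
        flags = (("r", read_bit), ("w", write_bit), ("x", exec_bit))
        return "".join(ch if mode & bit else "-" for ch, bit in flags)

    return {
        "owner": perms(stat.S_IRUSR, stat.S_IWUSR, stat.S_IXUSR),
        "group": perms(stat.S_IRGRP, stat.S_IWGRP, stat.S_IXGRP),
        "everyone": perms(stat.S_IROTH, stat.S_IWOTH, stat.S_IXOTH),
    }


# Mode 0640: owner may read and write, group may read, others get nothing.
print(mode_to_entries(0o640))
```

Each of the three entries would then have to be attached to a globally unique identity on the server side, which is where the ID mapping problem described above comes in.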

The network is the computer revisited

Christopher Hertel


Under the raised floor tiles of data centers around the world you will find tangles of multi-colored cables and strange, other-worldly interconnects. The swill from the overflowing bitbuckets seeps its way down to this level, where even DevOps fear to venture. In the darkness of these digital depths, strange new creatures evolve. The new breed of SmartNICs is an example. SmartNICs are similar to the familiar TOE and iSCSI cards, but the latest wave of evolution is based upon multi-core RISC-based I/O processors known as Data Processing Units, or DPUs.

This talk will cover a little project that started up just about a year ago. The Zambezi project aims to create an SMB3 Offload Engine, aimed at DPUs and SmartNICs. We will look at SMB3 message processing from the Syntactic to the Semantic layers and discuss how this processing can be broken down into layers so that much (if not all) of it can be offloaded to a (general-case) SmartNIC or other network infrastructure device.