2020 SDC India Abstracts

Plenary Abstracts

Looking forward as compute and storage merge

Jim Handy

Abstract

Computer architecture has recently begun a long journey towards merging processing and storage, which have historically been separate. This will accelerate data processing while helping to reduce cost, all to the benefit of cost/performance statistics. This presentation will show how today’s persistent memory is just the beginning of a trend to bring persistence deeper into the processor, first in cache memories, followed by persistent registers within the processor itself. At the other end of the spectrum, we will see how computational storage offloads processor tasks to lighten both processor load and network traffic, and learn why this technique makes systems scale in a significantly more linear way. We will explore other areas that could benefit from this approach but where it has not yet been tried.
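
The scaling claim above can be illustrated with a toy model. This is only a sketch under stated assumptions, not material from the talk: the function names, the bandwidth figures, and the 1% match rate are all hypothetical, and the model ignores everything except scan time and transfer over one shared link.

```python
# Toy model: time to scan a dataset spread over N drives.
# All figures are illustrative assumptions, not measurements.

def host_side_scan_seconds(drives, bytes_per_drive, link_bw):
    """Every byte crosses the shared link to the host before it is filtered."""
    return drives * bytes_per_drive / link_bw


def in_drive_scan_seconds(drives, bytes_per_drive, link_bw, drive_bw, selectivity):
    """Each drive filters its own data in parallel; only matches cross the link."""
    local_scan = bytes_per_drive / drive_bw  # same for every drive, done in parallel
    transfer = drives * bytes_per_drive * selectivity / link_bw
    return local_scan + transfer


if __name__ == "__main__":
    GB = 10**9
    for n in (1, 4, 16):
        host = host_side_scan_seconds(n, 100 * GB, 10 * GB)
        offload = in_drive_scan_seconds(n, 100 * GB, 10 * GB, 2 * GB, 0.01)
        print(f"{n:>2} drives: host-side {host:6.1f} s, in-drive {offload:6.1f} s")
```

In this model the host-side time grows in proportion to the number of drives because the shared link is the bottleneck, while the in-drive time stays close to the local scan time as drives are added. With a single drive and a slow in-drive engine, offloading can be slower, so the benefit depends on the assumed numbers.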

Storage Industry: The changing paradigm leading to Computational Storage

Rohit Srivastava

Abstract

The Storage industry has been witnessing a paradigm shift with the emergence of new and disruptive technologies. In the past three years, the Storage space has witnessed a spurt in areas like all-flash & NVMe, Predictive Storage Analysis, Cloud Storage, Data Protection, IoT Analytics, Computational Storage, and AI-driven Storage. But where did all this begin? It all started with magnetic disks coming to the fore for filesystems to manage data and deliver performance. SAN brought with it flexibility, sharing, and improved performance. This was followed by ‘Shared Nothing’, which came in with a purpose to divide & conquer, and also to enable parallel processing.

Solid State Drives (SSDs) made an appearance to help cut down service times considerably. By then, CPUs and buses had seen a manifold increase in speed. With great development come greater challenges. The challenge technologists faced was cutting down the overheads in the IO subsystems and in the various kernel processes involved in servicing these IOs. One thing led to another, and now we have NVMe.

To boost storage performance and exploit the might of NVMe, a mechanism was devised to couple an application directly with the storage subsystem, bypassing the kernel. This raised concerns about data transfer times, and hence the initiative to combine Compute and Storage into a single unit. The result was the birth of Computational Storage, the great hybrid with massive potential to disrupt the market.

PCIe Gen 5 Testing

Manoj Vaddineni

Abstract

This presentation will introduce PCIe Gen 5, give an overview of its adoption, and provide a technology update. Test scenarios and solutions will also be covered.

The story of hybrid data management across heterogeneous storage for cloud, core and edge

Sanil Kumar D and Ashit Kumar

Abstract

Whatever the technology, and whether in the cloud, at the edge or on-premises, data and data management are essential services for any solution today. However, the challenges posed by new technologies, heterogeneous storage, platforms and hybrid environments are mounting. How can we meet these challenges and build a unified data platform that lets us focus on business solutions without worrying about data and data management? What are we envisioning and developing in open source in the SODA Foundation projects under the Linux Foundation?

We will tell the story of data, its challenges and current demands, and an open unified data platform that can provide seamless data and storage management across heterogeneous platforms and storage, and across cloud, edge and core. Live demos will illustrate some of these solutions resolving these hybrid and heterogeneous challenges.

We will also discuss the architecture, technical solutions and future plans towards open data autonomy.

Data Mobility Trends

Prakash Venkat

Abstract

Data Mobility covers a wide range of features, from replication, disaster recovery, migration, and load balancing to backup, cloud tiering, and archival; in some sense, it involves the entire life cycle of data. Data Mobility is concerned not only with making data available at the right time and place, but also with aspects such as security and data integrity, to name a couple. This presentation attempts to capture the broad trends in this space, with a skew towards how the storage industry is adapting to this feature set, which is itself constantly evolving to keep pace with technology advancements.

Next Generation Data Management Capabilities

Roopesh Chuggani

Abstract

Data is growing exponentially everywhere, and with that growth data management is becoming critical. Enterprises of every size (small, medium or large) are investing heavily in managing data both at the source where it is created and wherever it is later moved. Every application aims to manage data while also processing it at speed. In this talk we will touch upon data management on edge devices (where the data gets created), as well as later, when it is moved and processed for some business purpose. The world is moving towards every device being intelligent enough to create, store, manage, process and move data across a network of devices or to the cloud; we will talk about next-generation capabilities from a data management perspective.

Day One Track 1 Abstracts

SPDK (Storage Performance Development Kit) to accelerate NVMe drives

Niranjan R Nilugal

Abstract

With improving technology and falling cost and power consumption, SSDs (Solid State Drives) have developed rapidly as storage media in recent years. However, the conventional NVMe I/O path relies on interrupts and requires frequent exchanges between user mode and kernel mode when processing each IO. The entire process involves multiple CPU context switches and memory copies. This approach is too inefficient to bring out the full performance of SSD hardware, which wastes storage resources. To make better use of SSD performance, we can adopt SPDK, a high-performance storage kit that combines networking, compute and storage technology to realize the full potential of solid-state storage. SPDK provides a set of tools and libraries for writing high-performance, scalable, user-mode storage applications. The bedrock of SPDK is a user-space, polled-mode, asynchronous, lockless NVMe driver. SPDK makes the fullest possible use of CPUs, NVMe SSDs and NICs, providing high performance at low cost and helping upper-layer applications take full advantage of NVMe SSDs. It also enables lower latency at no additional cost. SPDK achieves high performance using several key techniques:

Moving all the necessary drivers into userspace, which avoids syscalls and enables zero-copy access from the application.
Polling hardware for completions instead of relying on interrupts, which lowers both total latency and latency variance.
Avoiding all locks in the I/O path, instead relying on message passing.
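
The polling and message-passing techniques listed above can be sketched conceptually as follows. This is a minimal simulation, not SPDK code: `SimulatedQueuePair`, `run_reactor`, and the tick-based latencies are hypothetical names and values invented for illustration, and a real SPDK application would use the C API against actual hardware queues.

```python
from collections import deque


class SimulatedQueuePair:
    """Stand-in for a device submission/completion queue (hypothetical, not an SPDK API)."""

    def __init__(self):
        self._inflight = deque()
        self._ticks = 0

    def submit(self, io_id, latency_ticks, callback):
        # Asynchronous submit: record the request and return immediately.
        self._inflight.append((self._ticks + latency_ticks, io_id, callback))

    def poll_completions(self):
        # Advance simulated time by one tick and run callbacks for finished I/O.
        # Never blocks and never waits for an interrupt.
        self._ticks += 1
        pending = deque()
        completed = 0
        for ready_at, io_id, callback in self._inflight:
            if ready_at <= self._ticks:
                callback(io_id)
                completed += 1
            else:
                pending.append((ready_at, io_id, callback))
        self._inflight = pending
        return completed


def run_reactor(qpair, inbox, expected):
    """One loop owns the queue pair, so the I/O path itself needs no locks.

    Other parts of the program hand work to this loop by appending
    (io_id, latency_ticks) messages to `inbox` instead of sharing state.
    """
    finished = []
    while len(finished) < expected:
        while inbox:  # drain pending messages
            io_id, latency = inbox.popleft()
            qpair.submit(io_id, latency, finished.append)
        qpair.poll_completions()  # busy-poll for completions
    return finished


if __name__ == "__main__":
    inbox = deque([("read-a", 3), ("read-b", 1), ("read-c", 2)])
    print(run_reactor(SimulatedQueuePair(), inbox, expected=3))
```

Requests complete in order of their simulated latency rather than submission order, which is the usual consequence of asynchronous submission. The trade-off of polling is that the loop keeps a CPU core busy even when no I/O is outstanding.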

Day One Track 2 Abstracts

Deploying Computational Storage at scale with ease

Scott Shadley

Abstract

In this presentation, participants will learn the basic building blocks of Computational Storage.

We will show use cases of AI, ML, and edge deployments and cover all aspects of storage use cases.

Learning Outcomes

1. Computational Storage Overview

2. Use cases of Computational Storage

3. Traction in the Market and how to Deploy

Day Two Track 1 Abstracts

The future of accessing files remotely from Linux: SMB 3.1.1 update

Steven French and Shyam Prasad

Abstract

Enhancements to the SMB3.1.1 client on Linux have continued at a rapid pace over the past year. These allow Linux to better access Samba servers, as well as the Cloud (Azure), NAS appliances, Windows systems, Macs and an ever-increasing number of embedded Linux devices, including those using the new SMB3 kernel server for Linux (ksmbd). The SMB3.1.1 client for Linux (cifs.ko) continues to be one of the most actively developed file systems on Linux, and these improvements have made it possible to run additional workloads remotely.

The exciting recent addition of the new kernel server (ksmbd) also allows more rapid development and testing of optimizations for Linux.

Over the past year ...

* performance has dramatically improved with features like multichannel (allowing better parallelization of I/O and simultaneous use of multiple network devices), much faster encryption and now much faster signing, better use of compounding, improved support for RDMA, and improved caching, including extended use of directory leases

* security has improved with support for the strongest encryption, and alternative security models are now possible with the addition of modefromsid and idsfromsid

* multiuser scenarios and integration with Kerberos and Active Directory have improved

* new features have been added, including the ability to swap over SMB3 and boot over SMB3

* quality continues to improve with more work on 'xfstests' and test automation

* tooling (cifs-utils) continues to be extended to make use of SMB3.1.1 mounts easier
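
As a rough illustration of how some of the features above surface as mount options, here is a small sketch that assembles a `mount -t cifs` command line. The helper `cifs_mount_command` and the server and path names are hypothetical. The option names used (`vers=3.1.1`, `multichannel`, `max_channels=`, `seal`, `rdma`) are, to the best of my understanding, the ones documented in mount.cifs(8), but availability depends on kernel and cifs-utils versions, so check the manual page on your system.

```python
def cifs_mount_command(share, mountpoint, *, multichannel=False,
                       max_channels=None, encrypt=False, rdma=False, extra=()):
    """Build (but do not run) an argument list for mounting an SMB3.1.1 share."""
    options = ["vers=3.1.1"]  # request the SMB3.1.1 dialect
    if multichannel:
        options.append("multichannel")  # use several connections in parallel
        if max_channels is not None:
            options.append(f"max_channels={max_channels}")
    if encrypt:
        options.append("seal")  # SMB3 encryption on the wire
    if rdma:
        options.append("rdma")  # SMB Direct, needs RDMA-capable hardware
    options.extend(extra)  # any further options, passed through unchanged
    return ["mount", "-t", "cifs", share, mountpoint, "-o", ",".join(options)]


if __name__ == "__main__":
    # Hypothetical server and mount point, for illustration only.
    command = cifs_mount_command("//server.example/share", "/mnt/share",
                                 multichannel=True, max_channels=4, encrypt=True)
    print(" ".join(command))
```

The function only builds the argument list, so it can be inspected or logged before being passed to something like `subprocess.run` with appropriate privileges and credentials.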

This presentation will describe and demonstrate the progress made over the past year in the Linux kernel client in accessing servers using the SMB3.1.1 family of protocols. In addition, recommendations on common configuration choices and troubleshooting techniques will be discussed. With the exciting addition of a kernel server for Linux, we will also discuss some additional workloads that this and Samba make possible.

Learning Outcomes

1. What new SMB3 features for accessing servers from Linux are now available, and which ones are near completion and expected soon?

2. How can I configure the security settings to use SMB3.1.1 optimally for my workload?

3. How can I configure the SMB3.1.1 client optimally for the performance required for my workload?

Day Two Track 2 Abstracts

NAS File System: Panacea for extreme performance at high bandwidth to high IOPS workloads

Ankit Mathur

Abstract

Network Attached Storage systems have traditionally chosen either to excel at high-IOPS workloads with scaled-up systems or to win at the high-throughput game with scaled-out distributed systems. High-IOPS flows may involve many metadata operations along with small random reads and writes. High-throughput flows usually involve many large sequential reads and writes and far fewer metadata operations. The big names in each field have tried, with limited success, to make a dent in the other market. So, when we set out to build the next-generation stack for our foray into NAS, we wanted the architecture not only to fit both high-IOPS and high-throughput workloads but also to excel at them. This talk will review our recent work and experiences as we shaped our architecture to build this next-generation NAS, which could be our panacea for meeting these extreme needs with one architecture.