Storage Developer Conference Abstracts

Break Out Sessions and Agenda Tracks Include:

Note: This agenda is a work in progress. Check back for updates on additional sessions as well as the agenda schedule.

File Systems

MarFS: Near-POSIX Access to Object-Storage

Jeff Inman, Software Developer, Los Alamos National Laboratory
Gary Grider, HPC Div Director, LANL

Abstract

Many computing sites need long-term retention of mostly-cold data, often referred to as “data lakes”. The main requirement for this storage tier is capacity, but non-trivial bandwidth/access requirements may also exist. For many years, tape was the most economical solution. However, data sets have grown faster than tape bandwidth has improved, such that disk is now becoming economically feasible for this storage tier. MarFS is a near-POSIX file system that stores metadata across multiple POSIX file systems for scalable parallel access, while storing data in industry-standard, erasure-protected, cloud-style object stores for space-efficient reliability and massive parallelism. Our presentation will cover:

  • Cost modeling of disk versus tape for campaign storage
  • Challenges of presenting object storage through POSIX file semantics
  • Scalable parallel metadata operations and bandwidth
  • Scaling metadata structures to handle trillions of files, with billions of files per directory
  • Alternative technologies for data/metadata
  • The structure of the MarFS solution
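
As a rough illustration of the near-POSIX split described above, the following Python sketch (our own simplification, not MarFS's actual layout; all names are hypothetical) shows the core idea: each file's POSIX-visible metadata lives as a small stub in a metadata tree, while the data is striped across fixed-size objects in an object store:

    import json, os

    CHUNK = 8 * 1024 * 1024  # assumed object size; real chunking differs

    def object_names(file_id, size):
        """A logical file maps to a list of object names, one per chunk."""
        return ["%s.%d" % (file_id, i) for i in range((size + CHUNK - 1) // CHUNK)]

    def write_stub(md_root, posix_path, file_id, size):
        """The POSIX metadata tree holds a stub per file; parallel metadata
        scaling comes from spreading stubs across many such trees."""
        stub = os.path.join(md_root, posix_path.lstrip("/"))
        os.makedirs(os.path.dirname(stub), exist_ok=True)
        with open(stub, "w") as f:
            json.dump({"id": file_id, "size": size,
                       "objects": object_names(file_id, size)}, f)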

Green Storage

Using SPEC SFS with the SNIA Emerald Program for EPA Energy Star Data Center Storage Program

Vernon Miller, Senior Software Engineer, IBM
Nick Principe, Principal Software Engineer, EMC

Abstract

The next storage platform category to be added to the EPA Data Center Storage program is file storage. Come learn what it takes to set up a SNIA Emerald file testing environment with the SPEC SFS tools, plus the additional energy-related instrumentation and data-collection tools. Don't wait to be kicked in the "NAS" when an Energy Star rating gates sales of your file storage solutions.

Learning Objectives

  • Understand how the EPA Energy Star Data Center Storage Program applies to file and NAS environments.
  • Understand the methodology used in the SNIA Emerald Program to evaluate energy consumption of file and NAS environments.
  • Understand how the SNIA Emerald Program uses SPEC SFS2014 to drive file and NAS workloads.

KEYNOTE SPEAKERS AND GENERAL SESSIONS

Cloud Architecture in the Data Center and the Impact on the Storage Ecosystem: A Journey Down the Software Defined Storage Rabbit Hole

Dan Maslowski, Global Engineering Head, Citi Storage/Engineered Systems - Citi Architecture and Technology Engineering (CATE), Citigroup


Networking

NVMe Over Fabrics Support in Linux

Christoph Hellwig

Abstract

Linux is usually at the leading edge of implementing new storage standards, and NVMe over Fabrics is no different in this regard. This presentation gives an overview of the Linux NVMe over Fabrics implementation on the host and target sides, highlighting how early prototyping feedback influenced the design of the protocol. It also covers lessons learned during NVMe over Fabrics development and how they helped reshape parts of the Linux kernel to better support NVMe over Fabrics and other storage protocols.

Object Storage

Hardware Based Compression in Ceph OSD with BTRFS

Weigang Li, Software Engineer, Intel
Anjaneya (Reddy) Chagam, Principal Engineer, Intel

Abstract

Ceph is a distributed object store and file system designed to provide excellent performance, reliability, and scalability. BTRFS, with its compelling set of features, is recommended for non-production Ceph environments. In this talk we introduce our experimental work on integrating hardware acceleration into BTRFS to optimize the data compression workload in Ceph OSD. We analyze the nature of the compression feature in the BTRFS file system and the cost of the software compression library, and present an optimized solution that reduces CPU cycles and disk I/O when a hardware compression accelerator is enabled in Ceph OSD.

Learning Objectives

  • BTRFS compression architecture
  • Hardware based compression acceleration integration with BTRFS
  • Ceph performance improvement with the hardware acceleration

Object Storage Analytics: Leveraging Cognitive Computing for Deriving Insights and Relationships

Sandeep Patil, Master Inventor, IBM
Pushkar Thorat, Software Engineer, IBM

Abstract

Object storage has become the de facto cloud storage for both private and public cloud deployments. Analytics over data stored on an object store, for deriving greater insights, is an obvious exercise for implementers. Where the data resides within objects, the user-defined metadata associated with those objects can provide quick, relevant insights into the data, so leveraging user-defined object metadata for analytics can yield early insights. But having relevant user-defined metadata on every object is one of the biggest inhibitors to such analytics. Meanwhile, cognitive computing is on an upward trend where fiction meets reality: various cognitive services leverage extreme data analysis using machine learning techniques that help with data interpretation and beyond. In this presentation, we discuss how cognitive services can enrich object stores for analytics by self-tagging objects, which can be used not only for data analytics but also for deriving object relationships that help short-list and categorize the data for analytics. The presentation includes a demonstration using IBM Spectrum Scale Object Store, based on OpenStack Swift, and a popular cognitive service in the marketplace, IBM Watson.
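
A minimal sketch of the self-tagging idea, using the python-swiftclient API against an OpenStack Swift store; classify_object() is a hypothetical stand-in for a call to a cognitive service such as Watson:

    import swiftclient

    def classify_object(data):
        # Hypothetical stand-in for a cognitive service call (e.g. image
        # classification); returns a list of tags for the object's content.
        return ["cat", "outdoor"]

    def self_tag(conn, container, name):
        # Swift keeps user-defined metadata in X-Object-Meta-* headers;
        # POST updates metadata without rewriting the object body.
        _, data = conn.get_object(container, name)
        headers = {"X-Object-Meta-Tag-%d" % i: tag
                   for i, tag in enumerate(classify_object(data))}
        conn.post_object(container, name, headers=headers)

    # conn = swiftclient.client.Connection(authurl=..., user=..., key=...)
    # self_tag(conn, "photos", "cat.jpg")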

Learning Objectives

  • Understand how to analyze data on an object store
  • Learn about cognitive computing and how it relates to analytics
  • Understand how emerging cognitive services can be applied to an object store for better analysis of the data it hosts
  • Learn, practically, how to apply cognitive services to media-based workloads hosted on an object store to derive better insights

Fun with Linearity: How Encryption and Erasure Codes are Intimately Related

Jason Resch, Senior Software Architect, IBM

Abstract

Erasure codes are a common means to achieve availability within storage systems. Encryption, on the other hand, is used to achieve security for that same data. Despite the widespread use of both methods together, it remains little known that both of these functions are linear transformations of the data. This relation allows them to be combined in ways that are useful yet seemingly unknown and unused in practice. This presentation presents novel techniques built on this observation, including: rebuilding lost erasure-code fragments without exposing any information, decrypting fragments produced from encrypted source data, and verifying consistency and integrity of erasure-coded fragments without exposing any information about the fragments or the data.
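
A small, runnable illustration of the linearity being exploited (our example, using the simplest linear erasure code, XOR parity, together with CRC32; the talk's techniques generalize to codes such as Reed-Solomon):

    import os, zlib

    def xor(a, b):
        return bytes(x ^ y for x, y in zip(a, b))

    n = 64
    a, b = os.urandom(n), os.urandom(n)
    parity = xor(a, b)  # XOR parity: the simplest linear erasure code

    # CRC32 is affine over GF(2): crc(x) = L(x) ^ crc(zeros) with L linear,
    # so the parity's CRC follows from the fragment CRCs alone -- integrity
    # of coded fragments can be checked without seeing the fragments.
    zero = zlib.crc32(bytes(n))
    assert zlib.crc32(parity) == zlib.crc32(a) ^ zlib.crc32(b) ^ zero

    # Stream-cipher encryption is also an XOR, so it commutes with the code:
    # combining encrypted fragments yields the encrypted parity, and the
    # rebuilder never observes plaintext.
    ka, kb = os.urandom(n), os.urandom(n)  # per-fragment keystreams
    assert xor(xor(a, ka), xor(b, kb)) == xor(parity, xor(ka, kb))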

Learning Objectives

  • What are linear functions?
  • Examples of linear functions
  • Combining encryption and erasure codes
  • Exploiting the linearity of erasure codes to securely rebuild
  • Using the linearity of CRCs to securely verify erasure coded data

Storage Solutions for Private Cloud, Object Storage Implementation on Private Cloud, MAS/ACS

Ali Turkoglu, Principal Software Engineer Manager, Microsoft
Mallikarjun Chadalapaka, Principal PM, Microsoft

Abstract

In MAS (Microsoft Azure Stack), we implemented scalable object storage on a unified namespace, leveraging Windows Server 2016 file system features and, in places, creating new storage back ends for Azure block and page blobs as well as table and queue services. This talk details the capabilities of private cloud object storage, the key design/architectural principles, and the problems faced.

Learning Objectives

  • Private cloud object storage
  • Block and page blob design challenges
  • Microsoft Azure Stack

Open Source

Corporate/Open Source Community Relationships: The OpenZFS Example

Michael Dexter, Senior Analyst, iXsystems, Inc.

Abstract

Corporations and the global Open Source community have had a colorful relationship over the decades, with each group struggling to understand the priorities, abilities, and role of the other. Such relationships have ranged from hostile to prodigiously fruitful and have clearly resulted in the adoption of Open Source in virtually every aspect of computing. This talk will explore the qualities and precedents of strong corporate/Open Source relationships, focusing on the OpenZFS enterprise file system project as a benchmark of contemporary success. I will explore:

  • Historical and contemporary corporate/open source relationship precedents
  • Corporation/Project non-profit foundation relationships
  • Pragmatic project collaboration and event participation strategies
  • Motivations for relationship building

 

Learning Objectives

  • How do I work with the Open Source community?
  • What organizations can I turn to for guidance and participation?
  • What tangible resources can the community provide?

OpenStack

Introduction to OpenStack Cinder

Sean McGinnis, Sr. Principal Software Engineer, Dell
Walter Boring IV, Software Engineer, HPE

Abstract

Cinder is the block storage management service for OpenStack. Cinder allows provisioning iSCSI, Fibre Channel, and remote storage services to attach to your cloud instances. LVM, Ceph, and other external storage devices can be managed and consumed through configurable backend storage drivers. Led by some of the core members of Cinder, this session will provide an introduction to the block storage services in OpenStack as well as an overview of the Cinder project itself. Whether you are looking for more information on how to use block storage in OpenStack, looking to get involved in an open source project, or just curious about how storage fits into the cloud, this session will provide a starting point to get going.
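
For a feel of what Cinder looks like from the consumer side, here is a short sketch using the openstacksdk Python library (the cloud name "mycloud" is an assumed clouds.yaml entry):

    import openstack

    # Credentials come from a clouds.yaml entry named "mycloud" (assumed).
    conn = openstack.connect(cloud="mycloud")

    # Cinder's scheduler places the volume on one of the configured
    # backend drivers (LVM, Ceph, an external array, ...).
    vol = conn.block_storage.create_volume(size=10, name="demo-vol")
    conn.block_storage.wait_for_status(vol, status="available")
    print(vol.id)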

Learning Objectives

  • Cloud storage management
  • OpenStack storage

Performance

IOPS: Changing Needs

Jim Handy, General Director, Objective Analysis
Thomas Coughlin, President, Coughlin Associates

Abstract

Four years have elapsed since our first IOPS survey – what has changed? Since 2012 we have been surveying sysadmins and other IT professionals to ask how many IOPS they need and what latency they require. Things have changed over the past four years. Everyone understands that SSDs can deliver thousands to hundreds of thousands of IOPS (I/Os Per Second), with flash arrays offering numbers in the million-IOPS range, while HDDs support only tens to hundreds of IOPS. But many applications don’t need the extreme performance of high-end SSDs. In the survey, users shared their IOPS needs with us in 2012 and again in 2016. What has changed? What has remained the same? This presentation examines the need for high IOPS and profiles applications according to these needs.

Learning Objectives

  • Hear what your peers consider the necessary level of performance in both IOPS and latency for various common enterprise applications
  • Learn how the proper combination of HDDs and SSDS can satisfy IOPS and latency requirements for common enterprise activities
  • See some examples of how users have combined HDDs and flash memory to achieve cost effective solutions that meet their application requirements.

SPDK - Building Blocks for Scalable, High Performance Storage Applications

Benjamin Walker, Software Engineer, Intel Corporation

Abstract

Significant advances in throughput and latency for non-volatile media and networking require scalable and efficient software to capitalize on these advancements. This session will present an overview of the Storage Performance Development Kit (SPDK), an open source software project dedicated to providing building blocks for scalable and efficient storage applications with breakthrough performance. There will be a focus on the motivations behind SPDK's userspace, polled-mode model, as well as details on the SPDK NVMe, CB-DMA, NVMe over Fabrics, and iSCSI building blocks. See http://spdk.io.
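
SPDK itself is a C framework, but the polled-mode idea is easy to see in a toy sketch (ours, not SPDK's API): the application thread repeatedly asks the queue pair for completions instead of sleeping on an interrupt, avoiding context switches and locks at the cost of dedicating a core:

    import collections

    class PolledQueuePair:
        """Toy stand-in for an NVMe I/O queue pair with no interrupts."""
        def __init__(self):
            self.inflight = collections.deque()

        def submit(self, lba, callback):
            self.inflight.append((lba, callback))  # picked up by the "device"

        def process_completions(self, budget=32):
            done = 0
            while self.inflight and done < budget:
                lba, cb = self.inflight.popleft()
                cb(lba)  # completion callback runs inline on the polling core
                done += 1
            return done

    qp = PolledQueuePair()
    qp.submit(0, lambda lba: print("read of LBA", lba, "complete"))
    while qp.process_completions() == 0:
        pass  # the reactor spins (polls) instead of blocking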

Learning Objectives

  • Why use userspace drivers
  • When polling is better than interrupts
  • Applying shared-nothing architecture to storage

Persistent Memory

Enabling Remote Access to Persistent Memory on an IO Subsystem using NVM Express and RDMA

Stephen Bates, Sr. Technical Director, Microsemi

Abstract

NVM Express is predominantly a block-based protocol where data is transferred to/from host memory using DMA engines inside the PCIe SSD. However, since NVMe 1.2 there has been a memory access method called the Controller Memory Buffer (CMB), which can be thought of as a PCIe BAR managed by the NVMe driver. In addition, the NVMe over Fabrics standard released this year extends NVMe over RDMA transports. In this paper we look at the performance of the CMB access methodology over RDMA networks. We discuss the implications of adding persistence semantics to both RDMA and these NVMe CMBs to enable a new type of NVDIMM (which we refer to as an NVRAM). These NVRAMs can reside on the IO subsystem and hence are decoupled from the CPU and memory subsystem, which has certain advantages and disadvantages over NVDIMMs that we outline in the paper. We conclude with a discussion of how NVMe over Fabrics might evolve to support this access methodology and how the RDMA standard is also developing to align with this work.

Learning Objectives

  • NVM Express overview
  • NVMe over Fabrics overview
  • Using IO memory as an alternative to NVDIMM
  • Controller Memory Buffers in NVMe
  • Performance results for new access method

RDMA Extensions for Accelerating Remote PMEM Access - HW and SW Considerations, Architecture, and Programming Models

Chet Douglas, Principal SW Engineer, Intel Corporation
Raj Ramanujan, Sr. Principal Engineer, Intel Corporation


Learning Objectives

  • Introduce HW architecture concepts of Intel platforms that affect RDMA usage with persistent memory (PM)
  • Introduce proposed high-level HW modifications that can provide native HW support for pmem, reduce RDMA latency, and improve RDMA-with-pmem bandwidth
  • Focus on proposed Linux libfabric and libibverbs interface extensions and modifications to support the proposed HW extensions
  • Discuss open architecture issues and limitations of the proposed HW and SW extensions
  • Discuss Intel plans for standardization and industry review

Experience and Lessons from Accelerating Linux Application Performance Using NVMp

Vinod Eswaraprasad, Chief Architect, Wipro Technologies

Abstract

The SNIA Non-Volatile Memory Programming Model defines recommendations for how NVM behavior can be exposed to application software. Linux NVM library implementations provide a rich set of interfaces that applications can use to drastically improve their performance on systems with NVM.

We analyzed these available interfaces and how they can be leveraged in typical applications. Our work demonstrates how a sample open source Linux application can make use of these interfaces to improve performance. We also give examples of analyzing application code to find opportunities to use NVMP-style interfaces. The sample application we discuss is SQLite on Linux, in which we found nine such opportunities to use Linux NVMP-compatible interfaces. We also show how these storage optimizations can improve the overall I/O and performance of such applications.

Learning Objectives

  • NVMp - programming model and use cases
  • Linux NVMP implementations
  • Analyzing application for usage of NVM
  • Measure performance improvements in the application, with a specific workload focusing on I/O

Breaking Barriers: Making Adoption of Persistent Memory Easier

Sarah Jelinek, SW Architect, Intel Corp

Abstract

One of the major barriers to adoption of persistent memory is preparing applications to make use of its direct access capabilities. This presentation will discuss a new user-space file system for persistent memory and how it breaks down these barriers. The presentation will introduce the key elements to consider for a user-space persistent memory file system and discuss the internals of this new file system. The discussion will conclude with a presentation of the current status and performance of this new persistent memory file system.

Learning Objectives

  • Discussion of current barriers to persistent memory adoption.
  • Introduce how this new file system breaks down the barriers to adoption of persistent memory.
  • Introduce the SW internals of this file system.
  • Present performance statistics and discuss why this file system outperforms conventional, kernel-based file systems.

Building on The NVM Programming Model – A Windows Implementation

Paul Luse, Principal Engineer, Intel
Chandra Kumar Konamki, Sr Software Engineer, Microsoft

Abstract

In July 2012 the SNIA NVM Programming Model TWG was formed with just 10 participating companies, who set out to create specifications providing guidance for operating system, device driver, and application developers on a consistent programming model for next-generation non-volatile memory technologies. To date, membership in the TWG has grown to over 50 companies, and the group has published multiple revisions of the NVM Programming Model. Intel and Microsoft have been long-time key contributors in the TWG, and we are now seeing both Linux and Windows adopt this model in their latest storage stacks. Building the complete ecosystem requires more than just core OS enablement, though; Intel has put considerable time and effort into a Linux-based library, NVML, that adds value in multiple dimensions for applications wanting to take advantage of persistent byte-addressable memory from user space. Now, along with Intel and HPE, Microsoft is moving forward with its efforts to further promote this library by providing a Windows implementation with a matching API. In this session you will learn the fundamentals of the programming model and the basics of the NVML library, and get the latest information on the Microsoft implementation of this library. We will cover both available features/functions and timelines, as well as provide some insight into how the open source project went from idea to reality with great contributions from multiple companies.
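
The heart of the programming model is mapping persistent memory into the address space, storing to it directly, and flushing explicitly. The stdlib Python sketch below mimics that flow with an ordinary mmap; on a DAX-mounted persistent memory file system the same pattern bypasses the page cache, and the flush plays the role of NVML's pmem_persist():

    import mmap, os, struct

    fd = os.open("/tmp/pmem-demo", os.O_CREAT | os.O_RDWR, 0o600)
    os.ftruncate(fd, 4096)
    m = mmap.mmap(fd, 4096)

    struct.pack_into("Q", m, 0, 42)  # byte-addressable store, no read()/write()
    m.flush(0, 4096)                 # make the store durable (cf. pmem_persist)

    m.close()
    os.close(fd)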

Learning Objectives

  • NVM Programming Model Basics
  • The NVM Libraries (NVML)
  • The Windows Porting Effort

Low Latency Remote Storage - a Full-stack View

Tom Talpey, Architect, Microsoft

Abstract

A new class of ultra low latency remote storage is emerging - nonvolatile memory technology can be accessed remotely via high performance storage protocols such as SMB3, over high performance interconnects such as RDMA. A new ecosystem is emerging to "light up" this access end-to-end. This presentation will explore one path to achieve it, with performance data on current approaches, analysis of the overheads, and finally the expectation with simple extensions to well-established protocols.

Learning Objectives

  • Understand the potential for low latency remote storage
  • Survey the protocols and interfaces in use today
  • See current performance data, and future performance expectations
  • See a view of the future of the end-to-end storage revolution

The SNIA NVM Programming Model

Doug Voigt, Distinguished Technologist, Hewlett Packard Enterprise

Abstract

The SNIA NVM Programming Model enables applications to consume emerging persistent memory technologies through step-wise evolution to greater and greater value. Starting with an overview of the latest revision of the NVM Programming Model specification, this session summarizes the recent work of the NVM Programming TWG in the areas of high availability and atomicity. We take an application view of ongoing technical innovation in a persistent memory ecosystem.

Learning Objectives

  • Learn what the SNIA NVM programming TWG has been working on.
  • Learn how applications can move incrementally towards greater and greater benefit from persistent memory.
  • Learn about the resources available to help developers plan and implement persistent memory aware software.

SMB

SMB3 and Linux - A Seamless File Sharing Protocol

Jeremy Allison, Engineer, Samba Team/Google

Abstract

SMB3 is the default Windows and Mac OS X file sharing protocol, but what about making it the default on Linux? After developing the UNIX extensions to the SMB1 protocol, the Samba developers are planning to add UNIX extensions to SMB3 as well. Samba co-creator Jeremy Allison will discuss the technical challenges in making SMB3 a seamless file sharing protocol between Linux clients and Samba servers, and how Samba plans to address them. Come learn how Samba plans to make NFS obsolete (again :-)!

Learning Objectives

  • SMB3
  • Linux
  • Windows interoperability

SMR

ZDM: Using an STL for Zoned Media on Linux

Shaun Tancheff, Software Engineer, AeonAzure LLC

Abstract

As zoned block devices supporting the T10 ZBC and T13 ZAC specifications become available, there are few strategies for putting these low-TCO ($/TB and watts/TB) drives into existing storage clusters with minimal changes to the existing software stack.

ZDM is a Linux device mapper target for zoned media that provides a Shingled Translation Layer (STL) to support a normal block interface at near normal performance for certain use cases.
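
The essence of an STL fits in a few lines. The toy Python sketch below (ours, far simpler than ZDM itself) shows the two core structures: zones that accept only sequential appends at a write pointer, and a forward map from logical block to its current location; overwrites go to a fresh location and the old copy becomes garbage to collect later:

    class TinySTL:
        """Toy shingled translation layer: sequential-only zones plus a map."""
        def __init__(self, zones, blocks_per_zone):
            self.zones = [[] for _ in range(zones)]
            self.bpz = blocks_per_zone
            self.open = 0            # zone currently accepting appends
            self.fwd = {}            # logical block -> (zone, offset)

        def write(self, lba, data):
            if len(self.zones[self.open]) == self.bpz:
                self.open += 1       # zone full; a real STL would also GC
            zone = self.zones[self.open]
            zone.append(data)        # append at the zone's write pointer
            self.fwd[lba] = (self.open, len(zone) - 1)

        def read(self, lba):
            zone, off = self.fwd[lba]
            return self.zones[zone][off]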

ZDM-Device-Mapper is an open source project available on GitHub, with the goal of being upstreamed to the kernel.

Learning Objectives

  • ZDM performance compared to existing disk drive options
  • How to determine if your workload is a likely candidate for using ZDM.
  • How to use and tune ZDM effectively for your workload.
  • Using ZDM in cloud storage environments.
  • Using ZDM in low resource embedded NAS / mdraid.

Solid State Storage

Standards for Improving SSD Performance and Endurance

Bill Martin, Principal Engineer Storage Standards, Samsung

Abstract

Standardization efforts have continued for features to improve SSD performance and endurance. NVMe, SCSI, and SATA have completed standardization of streams and background operation control. Standardization is beginning on how SSDs may implement Key Value Storage (KVS) and In Storage Compute (ISC). This effort is progressing in the SNIA Object Drive Technical Work Group (TWG) and may involve future work in the protocol standards (NVMe, SCSI, and SATA). Currently the Object Drive TWG is defining IP-based management for object drives utilizing the DMTF Redfish objects. Future Object Drive TWG work will include APIs for KVS and ISC. This presentation will discuss the standardization of streams, background operation control, KVS, and ISC, and how each of these features works.
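
On Linux, the streams feature surfaces to applications as per-file write-lifetime hints, which the kernel can map to NVMe streams on capable devices. A hedged sketch follows; the fcntl constants are taken from linux/fcntl.h (they are not exported by Python's fcntl module, so verify them against your kernel headers), and actual behavior depends on kernel and device support:

    import fcntl, os, struct

    # From linux/fcntl.h; not exported by Python's fcntl module.
    F_SET_RW_HINT = 1036            # F_LINUX_SPECIFIC_BASE + 12
    RWH_WRITE_LIFE_SHORT = 2        # data expected to be short-lived

    fd = os.open("/tmp/hot.log", os.O_CREAT | os.O_WRONLY, 0o600)
    # Group short-lived writes together so the SSD can segregate them,
    # reducing write amplification and improving endurance.
    fcntl.fcntl(fd, F_SET_RW_HINT, struct.pack("Q", RWH_WRITE_LIFE_SHORT))
    os.close(fd)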

Learning Objectives

  • What streams are and how they work
  • What background operation control is
  • Progress of standardization in NVMe, SCSI, SATA, and SNIA

Storage Architecture

Heterogeneous Architectures for Implementation of High-capacity Hyper-converged Storage Devices

Michaela Blott, Principal Engineer, Xilinx/Research
Endric Schubert, PhD, CTO & Founder, MLE

Abstract

Latest trends in software-defined storage indicate the emergence of hyper-convergence, where compute, networking, and storage are combined within one device. In this talk, we introduce a novel architecture to implement such a node in one device. Its unique features include a combination of ARM processors for control plane functionality and dataflow architectures in FPGAs to handle data processing. We leverage a novel hybrid memory system mixing NVMe drives and DRAM to deliver a multi-terabyte object store with 10Gbps access bandwidth. Finally, network connectivity is accelerated by leveraging a full TCP/IP endpoint dataflow implementation within the FPGA’s programmable fabric. A first proof of concept deploys a Xilinx UltraScale+ MPSoC to demonstrate the feasibility of a single-chip solution that delivers unprecedented levels of performance (13M requests per second) and storage capacity (2TB) at minimal power consumption (<30W). Furthermore, the deployed dataflow architecture supports additional software-defined features such as video compression and object recognition without performance penalty, while resources fit in the device.


The Magnetic Hard Disk Drive Today’s Technical Status and Its Future

Edward Grochowski, Consultant, Memory/Storage
Peter Goglia, President, VeriTekk Solutions

Abstract

The ubiquitous magnetic hard disk drive continues to occupy a principal role in all storage applications, shipping more bytes than all other competing product technologies. The emergence of cloud computing has firmly established future HDD products well into the next decade. This presentation will discuss today’s HDD products and their technical characteristics, such as performance, reliability and endurance, capacity, and cost per gigabyte. The enabling technologies, such as form factor, interface, shingled writing, helium ambient, two-dimensional magnetic recording (TDMR), and the yet-to-be-implemented heat-assisted magnetic recording (HAMR), will be detailed. A projection of future disk drives will be made, and their competitiveness with flash as well as emerging non-volatile memories (NVM) will be discussed.


What One Billion Hours of Spinning Hard Drives Can Tell Us?

Andrew Klein, Director, Product Marketing, Backblaze
Gleb Budman, CEO, Backblaze Inc.

Abstract

Over the past 3 years we’ve been collecting daily SMART stats from the 60,000+ hard drives in our data center. These drives have over one billion hours of operation on them. We have data from over 20 drive models from all major hard drive manufacturers, and we’d like to share what we’ve learned. We’ll start with annual failure rates of the different drive models. Then we’ll look at the failure curve over time: does it follow the “bathtub curve” as we expect? We’ll finish by looking at a couple of SMART stats to see if they can reliably predict drive failure.

Learning Objectives

  • What is the annual failure rate of commonly used hard drives?
  • Do hard drives follow a predictable pattern of failure over time?
  • How reliable are drive SMART stats in predicting drive failure?

Reducing Replication Bandwidth for Distributed Document-oriented Databases

Sudipta Sengupta, Principal Research Scientist, Microsoft Research

Abstract

With the rise of large-scale, Web-based applications, users are increasingly adopting a new class of document-oriented database management systems (DBMSs) that allow for rapid prototyping while also achieving scalable performance. As in other distributed storage systems, replication is important for document DBMSs in order to guarantee availability. The network bandwidth required to keep replicas synchronized is expensive and is often a performance bottleneck. As such, there is a strong need to reduce the replication bandwidth, especially for geo-replication scenarios where wide-area network (WAN) bandwidth is limited. This talk presents a deduplication system called sDedup that reduces the amount of data transferred over the network for replicated document DBMSs. sDedup uses similarity-based deduplication to remove redundancy in replication data by delta encoding against similar documents selected from the entire database. It exploits key characteristics of document-oriented workloads, including small item sizes, temporal locality, and the incremental nature of document edits. Our experimental evaluation of sDedup with real-world datasets shows that it is able to significantly outperform traditional chunk-based deduplication techniques in reducing data sent over the network while incurring negligible performance overhead.
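
To make the similarity-plus-delta idea concrete, here is a stdlib Python sketch (ours, not sDedup's actual algorithm): a MinHash-like feature set identifies a similar document already present on the replica, and only an edit script is shipped instead of the full document:

    import difflib, hashlib

    def features(doc, k=8, n=4):
        """Similarity sketch: the n smallest hashes of the doc's k-grams."""
        grams = {doc[i:i + k] for i in range(max(1, len(doc) - k + 1))}
        return set(sorted(hashlib.blake2b(g.encode(), digest_size=8).digest()
                          for g in grams)[:n])

    def delta(new, base):
        """Edit script against a similar, already-replicated document."""
        return [op for op in difflib.SequenceMatcher(a=base, b=new).get_opcodes()
                if op[0] != "equal"]

    base = "user: alice, status: active, last_login: 2016-09-01"
    new = "user: alice, status: active, last_login: 2016-09-19"
    shared = features(new) & features(base)   # high overlap -> good delta base
    print(len(shared), delta(new, base))      # ship opcodes, not the document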

Learning Objectives

  • Replication in distributed databases
  • Techniques for network bandwidth reduction
  • Similarity detection in sDedup
  • sDedup system design

Storage Management

Overview of Swordfish: Scalable Storage Management

Richelle Ahlvers, Principal Storage Management Architect, Broadcom Limited

Abstract

The SNIA’s Scalable Storage Management Technical Work Group (SSM TWG) is working to create and publish an open industry standard specification for storage management that defines a customer-centric interface for the purpose of managing storage and related data services. This specification builds on the DMTF’s Redfish specification, using RESTful methods and JSON formatting. This session will present an overview of the specification being developed by the SSM TWG, including the scope targeted in the initial (V1) release in 2016 versus later releases (2017). This session will also position the specification developed by the SSM TWG relative to both SMI-S and the base Redfish specification.


Swordfish Deep-dive: Scalable Storage Management

Richelle Ahlvers, Principal Storage Management Architect, Broadcom Limited

Abstract

Building on the concepts presented in the Introduction to Swordfish (and Redfish) sessions, this session will go into more detail on the new Swordfish API specification.

Learning Objectives

  • Introduction to the specifics of the Swordfish API
  • Working with the Swordfish Schema

Introduction and Overview of Redfish

Richelle Ahlvers, Principal Storage Management Architect, Broadcom Limited
Jeff Autor, HP Enterprise

Abstract

Designed to meet the expectations of end users for simple, modern and secure management of scalable platform hardware, the DMTF’s Redfish is an open industry standard specification and schema that specifies a RESTful interface and utilizes JSON and OData to help customers integrate solutions within their existing tool chains.

This session provides an overview of the Redfish specification, including the base storage models and infrastructure that are used by the SNIA Swordfish extensions (see separate sessions for details).

We will cover details of the Redfish approach, as well as information about the new PCIe and memory models added to support storage use cases.
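
Since everything in Redfish is plain JSON over HTTPS, exploring a service takes only a few lines. The sketch below (endpoint and credentials are placeholders) walks from the fixed service root at /redfish/v1/ to the Systems collection by following @odata.id links:

    import requests

    base = "https://bmc.example.com"   # placeholder management endpoint
    auth = ("user", "password")        # placeholder credentials

    # The Redfish service root is always at /redfish/v1/.
    root = requests.get(base + "/redfish/v1/", auth=auth, verify=False).json()

    # Every resource links to others via @odata.id references.
    systems = requests.get(base + root["Systems"]["@odata.id"],
                           auth=auth, verify=False).json()
    for member in systems["Members"]:
        print(member["@odata.id"])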

Learning Objectives

  • Introduction to Redfish concepts
  • Application of REST APIs to standards-based management

Testing

Uncovering Distributed Storage System Bugs in Testing (not in Production!)

Shaz Qadeer, Principal Researcher, Microsoft
Cheng Huang, Senior Researcher, Microsoft

Abstract

Testing distributed systems is challenging due to multiple sources of nondeterminism. Conventional testing techniques, such as unit, integration and stress testing, are ineffective in preventing serious but subtle bugs from reaching production. Formal techniques, such as TLA+, can only verify high-level specifications of systems at the level of logic-based models, and fall short of checking the actual executable code. In this talk, we present a new methodology for testing distributed systems. Our approach applies advanced systematic testing techniques to thoroughly check that the executable code adheres to its high-level specifications, which significantly improves coverage of important system behaviors.

Our methodology has been applied to three distributed storage systems in the Microsoft Azure cloud computing platform. In the process, numerous bugs were identified, reproduced, confirmed and fixed. These bugs required a subtle combination of concurrency and failures, making them extremely difficult to find with conventional testing techniques. An important advantage of our approach is that a bug is uncovered in a small setting and witnessed by a full system trace, which dramatically increases the productivity of debugging.
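
The core trick, taming nondeterminism by taking over the scheduler and enumerating interleavings, can be shown in miniature. In this toy Python example (ours, not the Microsoft tooling), two "nodes" perform a non-atomic read-modify-write; exhaustively exploring all step interleavings deterministically finds the schedules that lose an update, and the failing schedule is itself the repro trace:

    import itertools

    def run(schedule):
        state = {"x": 0}
        local = {}
        # Each node executes two atomic steps: read x, then write x+1 back.
        steps = {
            ("n1", 0): lambda: local.update(n1=state["x"]),
            ("n1", 1): lambda: state.update(x=local["n1"] + 1),
            ("n2", 0): lambda: local.update(n2=state["x"]),
            ("n2", 1): lambda: state.update(x=local["n2"] + 1),
        }
        pc = {"n1": 0, "n2": 0}
        for node in schedule:        # the test, not the OS, picks who runs
            steps[(node, pc[node])]()
            pc[node] += 1
        return state["x"]

    # Enumerate every interleaving of the two nodes' steps.
    for sched in sorted(set(itertools.permutations(["n1", "n1", "n2", "n2"]))):
        if run(sched) != 2:
            print("lost update; repro schedule:", sched)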

Learning Objectives

  • Specifying distributed storage systems
  • Testing distributed storage systems
  • Experience with advanced testing techniques on distributed storage systems