2016 SNIA Tutorials - Storage Visions Abstracts


The Abstracts

Privacy vs Data Protection: The Impact of EU Data Protection Legislation
Thomas Rivera

After reviewing the diverging data protection legislation in the EU member states, the European Commission (EC) decided that this situation would impede the free flow of data within the EU zone. The EC response was to undertake an effort to "harmonize" the data protection regulations, and it started the process by proposing a new data protection framework. This proposal includes some significant changes, such as defining a data breach to include data destruction, adding the right to be forgotten, adopting the U.S. practice of breach notifications, and many other new elements. Another major change is a shift from a directive to a regulation, which means the protections are the same for all 27 countries and include significant financial penalties for infractions. This tutorial explores the new EU data protection legislation and highlights the elements that could have significant impacts on data handling practices.

Learning Objectives

  • Highlight the major changes to the previous data protection directives
  • Review the differences between "Directives" and "Regulations" as they pertain to EU legislation
  • Learn the nature of the Reforms as well as the specific proposed changes – in both the directives and the regulations

Practical Online Cache Analysis and Optimization
Carl Waldspurger, Irfan Ahmad

The benefits of storage caches are notoriously difficult to model and control, varying widely by workload, and exhibiting complex, nonlinear behaviors. However, recent advances make it possible to analyze and optimize high-performance storage caches using lightweight, continuously-updated miss ratio curves (MRCs). Previously relegated to offline modeling, MRCs can now be computed so inexpensively that they are practical for dynamic, online cache management, even in the most demanding environments. After reviewing the history and evolution of MRC algorithms, we will examine new opportunities afforded by recent techniques. MRCs capture valuable information about locality that can be leveraged to guide efficient cache sizing, allocation, and partitioning, in order to support diverse goals such as improving performance, isolation, and quality of service. We will also describe how multiple MRCs can be used to track different alternatives at various timescales, enabling online tuning of cache parameters and policies.

Learning Objectives

  • Storage cache modeling and analysis
  • Efficient cache sizing, allocation, and partitioning
  • Online tuning of commercial storage cache parameters and policies
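
As a rough illustration of what a miss ratio curve captures, the sketch below computes an exact LRU MRC from a block trace using the classic Mattson stack-distance approach. This is the simple, costly baseline, not the lightweight online techniques the tutorial refers to; the function name `lru_miss_ratio_curve` and the toy trace are assumptions made for this example.

```python
from collections import Counter


def lru_miss_ratio_curve(trace, max_size):
    """Return the LRU miss ratio for cache sizes 1..max_size.

    Hypothetical helper for illustration: exact stack-distance
    counting, quadratic in the worst case.
    """
    if not trace:
        return []
    stack = []          # LRU stack, most recently used block at the end
    hist = Counter()    # stack distance -> number of accesses
    for block in trace:
        if block in stack:
            # Distance 1 means the block was the most recently used one.
            depth = len(stack) - stack.index(block)
            hist[depth] += 1
            stack.remove(block)
        # First-time accesses are cold misses at every cache size.
        stack.append(block)

    total = len(trace)
    curve = []
    hits = 0
    for size in range(1, max_size + 1):
        # A cache of this size hits every access with distance <= size.
        hits += hist[size]
        curve.append((total - hits) / total)
    return curve


if __name__ == "__main__":
    # Toy trace: a loop over three blocks only starts hitting at size 3.
    print(lru_miss_ratio_curve(["a", "b", "c", "a", "b", "c"], 4))
```

Reading the curve from left to right shows how much each additional unit of cache reduces misses for this workload, which is the information that sizing and partitioning decisions rely on.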

Massively Scalable File Storage
Philippe Nicolas

The Internet changed the world and continues to revolutionize how people connect, exchange data, and do business. This radical change is one of the causes of the rapid explosion in data volume, which requires a new approach to data storage design. One common element is that unstructured data rules the IT world. How can the famous Internet services we all use every day support and scale with thousands of new users added daily and continue to deliver an enterprise-class SLA? What are the various technologies behind a Cloud Storage service that supports hundreds of millions of users? This tutorial covers technologies introduced by famous papers about Google File System and BigTable, Amazon Dynamo, and Apache Hadoop. In addition, Parallel, Scale-out, Distributed, and P2P approaches with Lustre, PVFS, and pNFS are presented, along with several proprietary ones. This tutorial also covers some key features, such as erasure coding, that are essential at large scale, to help attendees understand and differentiate industry vendors' offerings.

Learning Objectives

  • Understand various technologies around File Storage at megascale
  • Anticipate the recent technology wave around distributed storage with designs based on Google or Amazon research papers
  • Receive key elements and arguments to select the right solution for various needs
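
To give a feel for the erasure coding mentioned in this abstract, here is a minimal sketch of the simplest possible scheme: a single XOR parity block that can rebuild any one lost block in a stripe. It is an illustration only; large-scale systems typically use stronger codes such as Reed-Solomon that tolerate several simultaneous losses. The function names and the assumption of equal-length blocks are choices made for this example.

```python
def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together."""
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def encode(data_blocks):
    """Return the stripe: the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, lost_index):
    """Rebuild the block at lost_index from all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


if __name__ == "__main__":
    stripe = encode([b"abcd", b"efgh", b"ijkl"])
    # Pretend the second data block was lost and rebuild it.
    print(recover(stripe, 1))
```

With three data blocks and one parity block the storage overhead is about 33 percent, compared with 200 percent for keeping three full replicas, which is why such codes matter at large scale.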

Fog Computing and its Ecosystem
Ramin Elahi

Fog computing extends “cloud computing” by bringing computing and services to the edge of the network. Fog provides data, compute, storage, and application services to end-users. The distinguishing Fog characteristics are its proximity to end-users, its dense geographical distribution, and its support for mobility. Services are hosted at the network edge or even on end devices such as set-top boxes or access points. Thus, it can alleviate issues the IoT (Internet of Things) is expected to produce by reducing service latency and improving QoS, resulting in a superior user experience. Fog Computing supports emerging Internet of Everything (IoE) applications that demand real-time or predictable latency (industrial automation, transportation, networks of sensors and actuators). Thanks to its wide geographical distribution, the Fog paradigm is well positioned for real-time big data and real-time analytics. Fog supports densely distributed data collection points, hence adding a fourth axis to the often-mentioned Big Data dimensions (volume, variety, and velocity).

Implementing Stored-Data Encryption
Michael Willett

Data security is top of mind for most businesses trying to respond to the constant barrage of news highlighting data theft, security breaches, and the resulting punitive costs. Combined with litigation risks, compliance issues, and pending legislation, companies face a myriad of technologies and products that all claim to protect data-at-rest on storage devices. What is the right approach to encrypting stored data? The Trusted Computing Group, with the active participation of the drive industry, has standardized the technology for self-encrypting drives (SEDs): the encryption is implemented directly in the drive hardware and electronics. Mature SED products are now available from all the major drive companies, for both HDD (rotating media) and SSD (solid state), and for both laptops and data centers. SEDs provide a low-cost, transparent, performance-optimized solution for stored-data encryption. SEDs do not protect data in transit, upstream of the storage system. For overall data protection, a layered encryption approach is advised. Sensitive data (e.g., as identified by specific regulations: HIPAA, PCI DSS) may require encryption outside and upstream from storage, such as in selected applications or associated with database manipulations. This tutorial will examine a ‘pyramid’ approach to encryption: selected, sensitive data encrypted at the higher logical levels, with full data encryption for all stored data provided by SEDs.

Learning Objectives

  • The mechanics of SEDs, as well as application and database-level encryption
  • The pros and cons of each encryption subsystem
  • The overall design of a layered encryption approach