Over the past few years a number of new protocols have been devised to coherently connect the various memories in a computing system: OpenCAPI, CXL, and Gen-Z. In some cases these efforts have spawned other new ideas like OMI. What are all of these? How did they come into being? Why do they make sense? And how do they compare against one another? This presentation will explain each in simple terms to show why these were developed and where each one fits.
Data centers are evolving rapidly with the advent of newer technologies and varied market needs. Lower TCO, long-term device reliability, better cooling efficiency, and lower power consumption without compromising performance are becoming the key factors in a successful long-term deployment architecture. Adopting denser warm/cold storage and the right scalable methodology with Composable Disaggregated Infrastructure (CDI) in data centers is the way to go. On another front, the need for low-latency, high-performance, and secure data portability in extreme environments has paved the way for new-age edge servers. This presentation will take you through the important trends to embrace, which will enable storage experts to make the right choices for their companies and end customers.
Emerging and existing applications with cloud computing, 5G, IoT, automotive, and high-performance computing are causing an explosion of data. This data needs to be processed, moved, and stored in a secure, reliable, available, cost-effective, and power-efficient manner. Heterogeneous processing, tiered memory and storage architecture, infrastructure accelerators, and infrastructure processing units are essential to meet the demands of this evolving compute and storage landscape. These requirements are driving significant innovations across compute, memory, storage, and interconnect technologies. Compute Express Link* (CXL) with its memory and coherency semantics on top of PCI Express* (PCIe) is paving the way for the convergence of memory and storage with near memory compute capability. Pooling of compute, memory, and storage resources with CXL interconnect will lead to rack-scale efficiency with efficient low-latency access mechanisms across multiple nodes in a rack with advanced atomics, acceleration, smart NICs, and persistent memory support. In this talk we will explore innovations in PCIe and CXL, and look at multiple use cases for application of the technology.
Security is top of mind for CxOs and has become a board-level discussion in every organization. In this session, we will talk about technology trends and innovation in data protection and data security across Edge, Core and Cloud. Cybersecurity and ransomware are everyday topics, and security must be transformed into intrinsic security. What are we doing? How do we protect against cyberattacks and recover quickly from them?
This session will be in the form of a discussion between Vasu and Joji as they examine the impact of emerging technologies on mid-range storage solutions.
Around 4,000 B.C., when the Sumerians invented the first system of writing, they used clay tokens to represent the trades and deals they were making. To prevent tampering, they sealed the tokens in clay jars. This ensured that the data would remain untouched and would be securely preserved over time. Data protection has evolved quite a bit since then. From traditional backup and recovery to modern data protection, it has morphed into Data Resilience, which is where we are today. The session provides guidance on the best practices necessary for Cyber Resiliency. In addition, it gives insight into the technologies available from IBM to help implement these best practices.
There is a strong inclination across the enterprise towards transforming applications from a monolith to a microservices-based architecture. However, even before joining the race, there are many crucial questions to answer. For instance: How scalable will the designed microservices be? What migration patterns can be adopted? Are microservices even required for the application?
With some real-world examples from enterprise applications, this talk covers topics from domain-driven design to organizational change models, to help understand whether starting this journey is even suitable for a specific application. Further, we dive deeper into the technical aspects of decomposing a monolith, exploring real-world examples and extracting the migration patterns. We conclude by discussing the pain points and challenges the enterprise faces as the microservices architecture grows.
The Open Data Framework (ODF) unifies data and storage management from the core to the cloud and to the edge. In this talk, we will show how ODF simplifies Kubernetes storage management, provides data protection for applications, and connects on-prem data to clouds. We will also introduce how ODF can be extended with other SODA projects such as DAOS (a distributed asynchronous object storage for HPC), ZENKO (a multicloud data controller with search functionality), CORTX (an object storage optimized for mass capacity storage), and others (YIG, LINSTOR, OpenEBS). SODA Foundation is a Linux Foundation project focused on building an ecosystem of open-source data management and storage software for data autonomy.
Container storage solutions are gaining traction. Most vendors are working on storage-as-a-service solutions built on their storage products. In this session, we will explore potential solutions for hybrid data management and storage monitoring across heterogeneous storage systems (including cloud storage) using the Open Data Framework. We will illustrate the overall solution using two specific use cases: a) data protection across on-prem and cloud, and b) heterogeneous storage monitoring (where CSI has gaps, especially for storage monitoring). We will conclude with some thoughts on the future direction and pointers on container storage data management.
This presentation describes how to manage containerized software-defined storage software that includes a kernel module, through a combination of the Special Resource Operator and a custom Operator. A container is a standard unit of software that packages the code and all its dependencies so that the software can run in any computing environment. Containers are very lightweight compared to virtual machines because they share the same OS kernel. Many container orchestration platforms, such as Red Hat OpenShift, mandate that all software run as containers. No software installation tools are available and the file systems on root disks are read-only, so no software can be installed on the host. The expectation is that the software's lifecycle (deployment, upgrades, and so on) is managed through an Operator. An Operator is a method of packaging, deploying, and managing container applications: basic install, upgrades, failure recovery, and so on. Many software products include kernel modules that are not shipped by default with standard operating systems. These kernel modules are required for the proper and performant functioning of special hardware and software such as disk devices, HBA cards, and software-defined storage. This proposal discusses the use of two Operators, the Special Resource Operator (SRO) and a Custom Operator (CO), to seamlessly manage deployment and carry out rolling software upgrades and rolling kernel/OS upgrades when dealing with storage software that has kernel modules.
Summary: This paper describes various design and implementation changes in the iSCSI interconnect layer when using the native software stack, in comparison to the iSCSI offload stack from NIC hardware vendors like Chelsio or Mellanox. The study also provides more detail on how performance characteristics vary between the two iSCSI models.
Abstract: Existing Ethernet data centers use iSCSI as a storage interconnect, mainly because of its low-cost infrastructure and its suitability for deployment with minimal network administration. Recently, though, with the advent of smartNIC technologies from NIC vendors, iSCSI and TCP/IP functionality can be offloaded to high-performance CPUs on the adapter. This allows applications to make better use of host CPUs for more important mainline functions, e.g. accessing a backend database. The host stack, however, gives storage applications better control, with new features and API usages, compared to the offload stack, where a fixed set of transport APIs is used to access the external network. Also, host stack functionality works on standard NIC adapters across network speeds from 1G to 100G and does not need any special functionality on the NIC adapter to support a storage protocol like iSCSI.
This paper also talks about challenges while building such storage applications and other practical considerations, given that storage applications are quite different from standard applications running on server environments.
With emerging high-speed storage protocols like NVMe-TCP and NVMe-RDMA and low-cost infrastructure deployments, Ethernet is becoming more and more prominent as a storage interconnect. Because Ethernet is a widely used standard, it offers many deployment options in the form of different standards, protocols, and specifications. With the breakneck speed of new technology development and new standards coming up every few years, interoperability issues come to the forefront when designing and implementing an Ethernet storage solution, especially for cloud solutions. For example, differences in how NIC vendors implement FEC can cause serious issues such as a link becoming inactive.
This talk shares some of our first-hand experiences in the Ethernet storage world with interoperability issues that arise from the prevalence of numerous standards for various aspects of implementing and deploying Ethernet storage, e.g. host interop and hardware interop. We will discuss these issues along with a list of watchpoints and pitfalls for designing, implementing, and deploying an Ethernet storage solution. The paper gives insight into the practical considerations to keep in mind when deploying Ethernet storage solutions, so as to avoid the challenges that we have faced. It also covers the considerations involved in moving a solution to the cloud.
The concept of object storage was introduced in the 1990s, and since then it has evolved into a mainstream technology of choice for unstructured data. Today it has become the cornerstone for storing, protecting, and managing unstructured as well as semi-structured data. Unstructured and semi-structured data are predominantly generated by humans and machines. They make up as much as 90% of total enterprise data, most of which remains dark data. This data is growing so fast that, as per the Gartner Critical Capabilities report, by 2024 large enterprises are going to triple their unstructured data stored as file or object storage on premises, at the edge, or in the public cloud, compared with 2020.
Object storage through the years: first-generation object storage solutions were focused on data protection, archival, and compliance. Second-generation object storage solutions were focused on hybrid cloud, S3, metadata enrichment, etc. Today, object storage has gone well beyond that and caters to almost all tier-1 workloads.
Digital transformation has accelerated adoption of Cloud, social media, Mobility, IoT, and Big data; as a result many new data sources and modern application workloads are getting discovered and are driving the object storage consumption. The applications such as ultra-high-performance AI/ML workloads, IoT and unified data warehouses are leading this race.
According to another market projection (IDC), 80% of the data generated worldwide will be unstructured.
As analytics usage increases, organizations want faster performance to speed data-focused insights that can inform business decisions. A high-performance object storage solution can enable additional modern and cloud-native applications to gain the benefits of its scalability, fast data retrieval, and cost effectiveness.
Modern Data Platform: Navigating these challenges requires support for multiple use cases with visibility and control across all data. A modern digital platform enables easy management of volumes of data, seamless response to application demands, and wide data accessibility, while satisfying compliance requirements. Hence, a modern data platform must have the following characteristics/pillars:
1. Simple, Secure and Smart
2. Software Defined
3. Metadata-based
4. Limitless and Global
5. Multi-protocol and multi-cloud
6. APIs Driven
Traditionally, organizations have turned to object storage for highly scalable, cost-efficient, long-term storage with easy data retrieval—but not for performance. But today, organizations need to use all their data, not have some data stored in slow silos, where it is available in an emergency but not for production applications. Too much insight is available from data lakes, content repositories, and email archives for organizations to waste money storing it without usable performance.
Composing the individual atomic methods of concurrent data structures (CDS) poses multiple design and consistency challenges. In this context, the composition provided by transactions in software transactional memory (STM) can be handy. However, most STMs offer read/write primitives to access a shared CDS, and these read/write primitives result in unnecessary aborts. Instead, semantically rich higher-level methods of the underlying CDS, such as lookup, insert, or delete (in the case of hash tables or lists), help ignore unimportant lower-level read/write conflicts and allow greater concurrency.
We adapt the transaction tree model from databases to propose Object-based STM (OSTM), which enables efficient composition in CDS. We implement OSTM with a concurrent closed-addressed hash table (HT-OSTM) and a list (list-OSTM), which export the higher-level operations as the transaction interface. Experimental evaluation shows that HT-OSTM and list-OSTM outperform state-of-the-art STMs and incur negligible aborts.
As we are reaching the back half of 2021, it is time to see what is going on with the CS TWG and CS SIG around bringing Computational Storage into the mainstream storage architectures. While SNIA is continuing to drive the Architectural Document work, the presentation will provide some insights into the first implementation of CS to real products, through our partner organization, NVM Express. They have a working group on the topic as well, and the efforts of the two standards are maintained in lockstep. We will leave the viewers with a preview of the latest version of the Architectural document and links and materials available to show the direction and growth of Computational Storage in the market.
SSDs require a "Flash Translation Layer" (FTL) to manage the mapping between the sequential LBA space exposed to the client (referred to as "the host") and the actual "physical layout" of the corresponding data on the NAND flash media being managed. The FTL also serves to transparently manage the nuances of the underlying NAND flash cells. Because the underlying FTLs are designed to handle general-purpose workloads, they introduce inefficiencies in the form of "Write Amplification" (WAF) or "Space Amplification" (SAF). Certain new and upcoming specifications allow the host to assume some of the FTL's responsibilities on "special" SSDs. ZNS is one of them. This proposed specification is an extension to the NVMe standard that allows a host-based FTL to dictate physical layout, garbage collection, and high-level error management, while the underlying SSD firmware primarily focuses on wear levelling and media management. These special SSDs, usually referred to as "SDF/ZNS drives", thus provide multiple benefits, which will be explained through "ZNS use cases".
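The mapping role of the FTL and the origin of write amplification can be illustrated with a toy model. This is a minimal sketch, not any vendor's FTL: the class name `TinyFTL`, its methods, and the page-granular mapping are assumptions made purely for illustration. It relies only on the common definition of write amplification as NAND writes divided by host writes.

```python
class TinyFTL:
    """Toy page-mapped FTL (hypothetical): every write lands on a fresh physical page."""

    def __init__(self):
        self.l2p = {}          # logical block address -> physical page number
        self.next_page = 0     # next free physical page (flash is written out of place)
        self.host_writes = 0   # pages written on behalf of the host
        self.nand_writes = 0   # pages actually programmed on the media

    def _program(self, lba):
        # Out-of-place update: remap the LBA to a new physical page.
        self.l2p[lba] = self.next_page
        self.next_page += 1
        self.nand_writes += 1

    def write(self, lba):
        """A host write: counts once for the host and once for the media."""
        self.host_writes += 1
        self._program(lba)

    def relocate(self, lba):
        """Garbage collection moving still-valid data: a media write with no host write."""
        if lba not in self.l2p:
            raise KeyError(f"LBA {lba} holds no valid data")
        self._program(lba)

    def waf(self):
        """Write amplification factor = NAND writes / host writes."""
        return self.nand_writes / self.host_writes


ftl = TinyFTL()
for lba in range(4):
    ftl.write(lba)      # LBAs 0..3 land on physical pages 0..3
ftl.write(0)            # overwrite of LBA 0 goes to a new page, not in place
ftl.relocate(1)         # garbage collection copies valid data forward
ftl.relocate(2)
print(ftl.l2p, ftl.waf())
```

In this model the relocations are the only source of amplification, which is why moving garbage collection decisions to a host that knows the workload can reduce it.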
The last decade of computing has made enormous progress in storage technologies. Improvements in hardware like SSDs and bus protocols like NVMe have the potential to reach IO speeds close to DRAM.
The ultimate industry goal is to unify the storage model so that all storage can be mapped into a process address space.
Until these hardware technologies reach a latency comparable to DRAM, the industry is still inclined to follow the von Neumann model. There is one problem on the way to this goal, though: the operating system latency involved in every IO. One mitigation is to perform asynchronous IO. Unfortunately, many legacy applications do not have the intelligence to perform asynchronous IO, and converting application logic from synchronous to asynchronous IO requires a huge investment. Kasync is a method that converts synchronous IO to asynchronous IO without any change in application logic, or even the need to recompile the application.
Discussion of the new features added in CXL 2.0. Impact of persistent memory on CXL. Future memory and storage solutions with CXL. CXL-attached memory solutions vs. DIMM-based memory solutions. System memory expansion and use classes with CXL.
Security attacks are growing everywhere. With data breaches continuing to occur at an alarming pace every year, it is paramount to invest in securing the infrastructure on which data resides. In this talk, we will learn the principles behind building trustworthy systems that reduce the incidence of security attacks and lower risk to customers.
More and more IoT devices are coming online daily, joining the billions that are already connected. They generate a huge amount of data which needs to be stored and analyzed in real time. The cloud can really help in speeding up the creation of such a data store. However, proper thought has to be given to design a scalable and secure IoT data store.
In this talk, I will start by explaining the typical kind of data IoT devices generate. I will then go on to explain how cloud storage can be used to store this data. I will cover different IoT platforms like DeviceHive and also some out of the box solutions provided by popular cloud providers.
I will then go on to explain the possible security challenges for storing IoT data and how we can mitigate them.
This session talks about data protection trends and, with the tremendous growth in data, how to protect it in the most efficient way. It also covers data protection in the multi-cloud/hybrid cloud world, including cloud-native workloads. With ransomware attacks increasing globally, the session also focuses on key industry trends around ransomware protection and how businesses should be ready with solutions that provide recoverability and availability of business-critical data in case production is compromised.
Every day, a huge amount of unstructured data is generated that needs to be processed and used for Artificial Intelligence (AI) and the Internet of Things (IoT). Intense investment in AI and IoT is leading to rapid innovation in these technologies. These technologies rely on the cloud for storing structured and unstructured data, but fetching and processing data from the cloud is slow because of increased latency, and power consumption is high. In addition, end users are not comfortable storing data in the cloud because of privacy concerns. These issues are pushing AI applications to implement their intelligence at the edge of the network instead of in data centers.
Storage is an important module at the network edge, as it can accommodate memory-hungry AI processes like training, which use off-chip memory to keep up with performance improvements. As AI and IoT become popular, many professionals and companies use boards like the Raspberry Pi and Arduino, which are small, inexpensive boards that allow connecting various external accessories such as sensors to create applications.
Popular boards like the Raspberry Pi use low-cost microSD cards for booting and storing data. This paper explains methods for obtaining benchmarking results and lists the important parameters and their values that help evaluate which cards should be chosen to implement an application on a Raspberry Pi board. The paper also explains a method for emulating an AI application that helps estimate the lifetime of a microSD card under various workloads.
Hadoop has become imperative for processing large and complex data sets. However, architecture scalability issues often pose an unnecessary roadblock in this process. OpenStack, an open-source software platform, provides the operational flexibility required to scale out the architecture for Hadoop. This session will shed light on how we helped a client install Hadoop on VMs using OpenStack. We will discuss in depth the challenges of manual operations and how we overcame them with Sahara. The audience will also learn why we virtualized hardware on the computing nodes of OpenStack and deployed VMs.
Did you know that 96 percent of firms have suffered at least one outage in the last three years? It is a fact that firms suffer a monetary and moral loss when a data center blacks out and no data recovery or backup options are available. This session will discuss a real-life scenario in which we helped an ISV upgrade its K8s backup platform for end-to-end data management and disaster recovery. The session will provide complete insights into the challenges and steps involved in installing Rook Ceph using K8s storage classes. The audience will gain a thorough understanding of the architecture, including Ceph RBD mirroring. We shall also expand on the role of ArgoCD as cluster and application configuration manager.
Applying AI at the edge and endpoints often requires working under non-data center environments and in power constrained conditions. AI inference also requires significant memory to hold weighting values from training. New non-volatile memories can help provide more memory in a given device die and use less power than NOR flash, SRAM or DRAM. This presentation will talk about changes in the memory/storage hierarchy and how it will change memory and storage in data centers and embedded devices to support energy efficient and low latency AI applications.
Your business data represents your intellectual capital, competitive differentiation, and lifeblood of your organization. And not all data is equal. With clearly defined service level agreements for data protection, you can enable your Digital Transformation. Learn how to protect your enterprise application data depending on workload requirements such as backup, disaster recovery and compliance, and even archive to the Cloud for long-term retention. Empower your storage administrators today!
In this talk you will learn about the reasons for building a specialized storage system for databases, and how Aurora databases are built using such a storage system in cloud.
Compute, network, and storage are geared towards higher throughput and IOPS, creating immense amounts of data. Organizations are opting for hybrid, multi-cloud, and data center solutions for hosting data. And increasingly complex principles and rules now govern the handling of various types of data. Are the data management solutions around it keeping up?
As Software Defined Storage (SDS) and Hyperconverged Infrastructure (HCI) become popular, users are hitting some limits in scalability & performance for these solutions. Typically, Mirroring & Erasure Coding are the existing data protection techniques deployed for SDS & HCI. However, each of these techniques has its pros & cons. In particular, the network has become the performance/scalability bottleneck. Hitachi has developed a new Polyphased Erasure Coding technique that can solve this problem and remove the barrier to higher scalability. Come to the presentation and find out more!
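The trade-off between mirroring and conventional erasure coding mentioned above can be made concrete with a little arithmetic. The sketch below is illustrative only and does not model the Polyphased Erasure Coding technique; the function names are hypothetical, and the rebuild figure assumes a standard k+m code in which any k surviving fragments are read to reconstruct a lost one.

```python
def mirroring_overhead(copies: int) -> float:
    """Raw capacity consumed per unit of user data with N-way mirroring."""
    return float(copies)


def erasure_coding_overhead(data_fragments: int, parity_fragments: int) -> float:
    """Raw capacity consumed per unit of user data with a k+m erasure code."""
    return (data_fragments + parity_fragments) / data_fragments


def rebuild_reads(data_fragments: int, mirrored: bool) -> int:
    """Fragments read over the network to rebuild one lost fragment.

    Mirroring copies a single surviving replica; a conventional k+m code
    (assumed here) reads k surviving fragments.
    """
    return 1 if mirrored else data_fragments


mirror3 = mirroring_overhead(3)           # tolerates 2 lost copies
ec_4_2 = erasure_coding_overhead(4, 2)    # tolerates 2 lost fragments
print(mirror3, ec_4_2)
print(rebuild_reads(4, mirrored=True), rebuild_reads(4, mirrored=False))
```

Both layouts survive two failures, but 3-way mirroring costs twice the raw capacity of 4+2 erasure coding, while the erasure code moves more data across the network during a rebuild.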
With the evolution from virtual machines to containers, the industry is heavily adopting container-based solutions in which workload portability is a key focus. Most enterprise applications are stateful and require data to be stored on persistent storage. In this presentation we discuss how containers use persistent storage through CSI. We will discuss the CSI architecture and how third-party storage vendors write CSI drivers for their storage systems. We will discuss how the industry is adopting CSI features for its storage services. Then we will dive deep into one of the upcoming CSI features in Kubernetes, CSI Volume Health Monitor, and discuss its architecture and implementation progress.
HPCC Systems, an open-source cluster computing platform for big data analytics, includes a Generalized Neural Network bundle with a wide variety of features that can be used for various neural network applications. To enhance the functionality of the bundle, this paper proposes the design and development of Generative Adversarial Networks (GANs) on the HPCC Systems platform using ECL, the declarative language on which HPCC Systems works. GANs have been developed on the HPCC platform by defining the Generator and Discriminator models separately and training them in batches within the same epoch. To make sure that they train as adversaries, a specific weight transfer methodology was implemented. The MNIST dataset, which was used to test the proposed approach, provided satisfactory results. The results obtained were unique images very similar to those in the MNIST dataset, as expected.
The sparing mechanism is one of the factors to be considered for system reliability. However, configuring more spares than required, or too few, each has its drawbacks. A shortage can increase the restoration time after a failure, thus reducing availability, while an excess of spare parts has its own financial implications.
Usually, the traditional sparing mechanism fails to resolve the dilemma of excess versus shortage of spares, because its first level of “sparing method” (in any storage system) is to identify the “failed/failing” disk in the RAID group and then replace it with the spare drive (any novel sparing mechanism can be used). This is where the problem manifests: the drive is labelled as “Good” or “Bad” (failed/failing) by underlying statistical or threshold-based machine learning methods. This approach still has its drawbacks; most prominently, it is a static (0 or 1) way of denoting a disk as either “Good” or “Bad”.
To overcome this issue, an intelligent sparing mechanism is presented. The method is based on the “degree-of-badness” and helps identify which RAID group needs more attention and how to assign the spares. Moreover, it provides an automated configuration change for the spare drives: a healthier RAID group (low degree of bad disks) has fewer spare disks assigned, and whenever the degree-of-badness increases, the number of spare drives is increased.
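One way to picture a sparing policy driven by a continuous degree-of-badness is sketched below. The abstract does not describe how the score is computed or how it maps to a spare count, so everything here is an assumption for illustration: per-disk scores in the range 0.0 to 1.0, the group score taken as their mean, and the hypothetical `min_spares` and `max_spares` bounds.

```python
import math


def group_badness(disk_scores):
    """Degree-of-badness of a RAID group, assumed here to be the mean disk score."""
    return sum(disk_scores) / len(disk_scores)


def spares_for_group(disk_scores, min_spares=1, max_spares=4):
    """Scale the spare count with the group's degree-of-badness.

    A fully healthy group gets min_spares; the count grows towards
    max_spares as the score approaches 1.0 (hypothetical policy).
    """
    degree = group_badness(disk_scores)
    return min_spares + math.ceil(degree * (max_spares - min_spares))


healthy = [0.0, 0.25, 0.0, 0.0]       # per-disk scores between 0.0 and 1.0
degrading = [0.5, 1.0, 0.75, 0.75]

print(spares_for_group(healthy), spares_for_group(degrading))
```

Unlike a binary Good/Bad label, the score lets the spare count change gradually as a group's health drifts, before any single disk is declared failed.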
Today, data plays an ever more crucial role for any business, and its role keeps growing with time. For the whole software industry, data is the pivot. Storing this data in a cost-effective way is a necessity. But the key is to determine when and where to store the data effectively and efficiently. Consider data that is rarely used but sits on your primary storage. It incurs a high cost with little ROI. Yet for various reasons, deleting this data may not be possible. Moving such data to secondary, archival, or cold storage can save cost as well as maintain compliance.
Data being generated and accessed can follow trends and patterns. Understanding those trends helps decide intelligently when and which data should be archived. These trends can be captured by simple heuristics, by access patterns, or by a catalog defined by the admins.
SODA Foundation is an Open Data Framework for data On-prem, on Cloud, and at the Edge. SODA has core projects which support Data Archival onto on-prem Cold Storage like SONY ODA and to Archival/Cold storage of different cloud providers.
This new solution intends to use this underlying platform so that, based on AI/ML/cognitive methods or a policy-based catalog, data can be smartly moved to and from archival storage. The data movement intelligence reads the metadata and moves data based on defined algorithms.
The intent is to automate data migration to and from archival storage across the hybrid environment and save cost as well as maintain compliance.
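A simple heuristic policy of the kind described above could look like the following. This is a hedged sketch rather than the SODA implementation: the `ObjectMeta` record, the `pin_to_primary` catalog flag, and the 180-day idle threshold are all hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class ObjectMeta:
    """Metadata the policy reads; field names are illustrative assumptions."""
    name: str
    last_access: datetime
    pin_to_primary: bool = False   # hypothetical admin-defined catalog override


def select_for_archive(objects, now, max_idle=timedelta(days=180)):
    """Return names of objects idle longer than max_idle and not pinned by the catalog."""
    return [
        obj.name
        for obj in objects
        if not obj.pin_to_primary and now - obj.last_access > max_idle
    ]


now = datetime(2021, 12, 1)
catalog = [
    ObjectMeta("q1-report", datetime(2021, 1, 15)),
    ObjectMeta("live-db-dump", datetime(2021, 11, 20)),
    ObjectMeta("audit-log", datetime(2020, 6, 1), pin_to_primary=True),
]
to_archive = select_for_archive(catalog, now)
print(to_archive)
```

Only metadata is consulted, so the decision can be made without reading object contents; a learned access-pattern model could replace the fixed threshold behind the same interface.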
Data management is a key value ingredient in data center analytics. It involves file system and object store analysis, software scans, and maps that enable data and analytics leaders to make better data management decisions for unstructured data. This in turn reduces risk, lowers the costs associated with data storage, and ensures higher performance per TB invested.
Data classification for business value is a topic that is more relevant than ever, and more organizations are curious to understand their investments and rationalize them based on file and object store analytics. Vendors in this space differ in the algorithmic choices made for file scanning, indexing, and representing the historical and current trends of data layout. There are associated performance implications, and lessons to be learned from understanding the choices in design and architecture decisions. Depending on how the metadata scans are stored, the question often arises of how much scale and performance can be expected from a given solution. The level of scanning can also have implications for latency and for retrieval of analytics for the end user. Once the analytics part is hardened, advanced steps can involve scanning content and deciding either to move the content to a new destination or to delete it, to optimize efficiency and lower the cost of ownership.
In this talk, let's walk through some of the performance implications and learnings for managing unstructured data and providing operational efficiency for users engaging with the data.
Supercomputers running HPC workloads generate vast amounts of data, typically organized into large directory hierarchies on parallel filesystems. Dataset sizes can range from millions to a few billion files spread across thousands of directories of varying depth and span. Hence, the performance of metadata operations becomes a crucial factor in overall system performance. File create and delete operations inside a directory are usually serial, because the parent directory lock needs to be taken to protect the directory against concurrent write access. Persisting metadata is found to be expensive and contributes the most to the overall latency.
We propose a batch commit technique to improve metadata write I/O performance. Experiments show that it improves metadata write I/O performance by around 50 percent. Supercomputing applications often need to run queries on directories. Typically these operations require a complete directory traversal. In UNIX and UNIX-like operating systems, ‘find’ is a command that helps applications locate files based on user-specified criteria and then apply requested actions to each matched item. The ‘find’ command uses readdir, a POSIX API, to read a directory, and after getting the complete list of directory entries, it applies a filter to select those that satisfy the user-specified criteria. It typically takes a regex (regular expression) pattern and matches it against all the entries obtained from reading the target directory. Users may sometimes also specify attributes such as size or last modification time, along with selection criteria to match against those attributes. In that case the ‘find’ command needs to get the file attributes of the entries whose names matched the user-provided regex pattern, using the getattr POSIX API. Finally, the ‘find’ command may perform some action on all the selected items, such as deleting them or changing file permissions. For a local file system this may be acceptable, but for a remote file system it requires the client to reach the server multiple times: first to get all the directory entries, then to get the file attributes of those that matched the user-provided pattern, and finally to perform the action on the matched items. If the number of directory entries is too large, a single request may not be able to fetch all the entries, which requires additional round trips.
Making several round trips, and especially fetching all the entries and then eliminating most of them at the client based on user-specified criteria, wastes valuable network bandwidth as well as server and client CPU. Some protocols, such as NFS, provide an extension to the standard readdir API called readdirplus, which returns the attributes along with the directory entries to reduce the number of network round trips. But this requires extra disk I/O to fetch the attributes of directory entries that do not satisfy the user criteria. Furthermore, the amount of data transferred over the network in this scheme becomes even higher. We propose a near-data processing technique that solves all the aforementioned problems and improves the performance of the ‘find’ command by more than 100 times.
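The difference between filtering at the client and filtering near the data can be sketched in a few lines of Python. This is only an illustration on a local directory, not the technique proposed in the talk: the function names and the "entries transferred" counter are hypothetical, and a shell-style pattern (fnmatch) stands in for the regex mentioned above.

```python
import fnmatch
import os


def client_side_find(root, pattern):
    """Baseline: fetch every directory entry, then filter at the client."""
    entries = os.listdir(root)  # every name would cross the network
    matches = sorted(n for n in entries if fnmatch.fnmatch(n, pattern))
    return matches, len(entries)  # second value: entries transferred


def server_side_find(root, pattern, min_size=0):
    """Near-data variant: apply name and size filters where the data lives."""
    matches = []
    with os.scandir(root) as it:
        for entry in it:
            if not fnmatch.fnmatch(entry.name, pattern):
                continue  # attributes are never read for non-matching names
            if entry.stat().st_size >= min_size:
                matches.append(entry.name)
    matches.sort()
    return matches, len(matches)  # only matching entries are transferred
```

In the second function the name filter runs before any attribute lookup, so attributes are read only for entries whose names match, and only the final matches would need to be sent back to the client.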
Zoned storage devices are a class of storage devices whose address space is divided into zones with write constraints different from those of regular storage devices. Each zone must be written sequentially and reset before it is rewritten. The main type of zoned block device (ZBD) currently available is the SMR HDD. The NVMe Zoned Namespace (ZNS) command set was also recently standardized and defines a zone abstraction for NVMe namespaces similar to SMR HDD zones. The benefits of ZBDs over traditional storage are better capacity, higher performance and lower cost. Initial support for zoned block devices in a native filesystem was added in Linux kernel 4.10. Since then, support for these devices has been under continuous development in the kernel.
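The write constraint described above can be modelled with a small Python class. This is a toy model under simplifying assumptions, not a device or kernel interface: the class name, the block-granular capacity and the error messages are all hypothetical.

```python
class Zone:
    """Toy model of one zone: sequential writes only, reset before rewrite."""

    def __init__(self, capacity):
        self.capacity = capacity        # zone size in blocks (assumed unit)
        self.write_pointer = 0          # next block that may be written
        self.blocks = [None] * capacity

    def write(self, offset, data):
        if self.write_pointer >= self.capacity:
            raise ValueError("zone is full; reset it before rewriting")
        if offset != self.write_pointer:
            raise ValueError("writes must land exactly at the write pointer")
        self.blocks[offset] = data
        self.write_pointer += 1

    def reset(self):
        """Discard the zone contents and rewind the write pointer."""
        self.blocks = [None] * self.capacity
        self.write_pointer = 0
```

A filesystem on such a device cannot update a block in place, which is why the on-disk layout changes discussed in this talk are needed.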
Among the existing Linux filesystems, F2FS already supports zoned devices and allows normal operations on them. In addition, zonefs, a special filesystem for zoned devices, was included in the 5.6 kernel. Using zonefs requires applications designed for this purpose, as the filesystem does not support the creation of normal files. For Btrfs, basic support is now present, and performance improvements and some other enhancements were added in kernel 5.12. Zoned block devices behave differently from traditional ones, so a filesystem requires layout changes to support them.
This talk is inspired by the work of Damien Le Moal and Naohiro Aota and the patches they submitted to the Linux kernel. It will cover the core principles of ZBDs and the work done so far in native filesystems to support these devices.
This talk presents an innovative AI/ML-based approach that simulates customer disk storage system environments for product qualification. The solution includes an automated framework that fetches customer system telemetry data and analyses it to create ML models. These ML models recommend the configurations and feature sets that are closest to the largest number of customer disk storage systems. The solution also analyses customer-reported issues and tunes the system configurations and features that are prone to error. This end-to-end automated solution not only aligns the test environment with the most commonly found customer telemetry data, but also ensures that the product is qualified against real customer environments.
Anomaly detection is the identification of rare events, abnormal data or abnormal trends by comparing current data with past trend data.
A ransomware or malware attack on a system can change data on that system. For example, a ransomware attack can encrypt all the data on the system or delete files from it. If we have a past trend of activities or attributes, such as file count or size, anomaly detection can be used to identify whether a ransomware attack has occurred on the system.
A backup and recovery product collects various backup metadata during the backup process at a specified frequency. A ransomware or malware attack can alter that backup metadata, so by running anomaly detection after a backup completes we can determine whether the changes in the metadata are anomalous and correlate them with an attack. This presentation explains how anomaly detection using machine learning can be used in the backup flow to identify a ransomware or malware attack on the system. It will also outline actions that users can take to enrich the learning of the machine learning model for better detection.
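As a rough illustration of comparing a new backup against its past trend, the sketch below flags a value that deviates strongly from history. It uses a plain z-score as a stand-in for the machine learning model described above; the function name, the threshold of 3 standard deviations and the idea of feeding it per-backup file counts are assumptions made for this example.

```python
import statistics


def is_anomalous(history, value, threshold=3.0):
    """Flag `value` if it lies more than `threshold` standard deviations
    from the mean of `history` (e.g. file counts of earlier backups)."""
    mean = statistics.mean(history)
    stdev = statistics.pstdev(history)
    if stdev == 0:
        # A perfectly flat history: any change at all is unusual.
        return value != mean
    return abs(value - mean) / stdev > threshold
```

A sudden drop in file count after mass deletion, or a jump in changed bytes after mass encryption, would show up as a large deviation, while normal day-to-day variation would not.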
Prediction is power.
Predictions, in general, are not new. Since the beginning of human civilization, we have all been predicting events that may happen in the future, using one approach or another. The recent revolution in data science, machine learning, and artificial intelligence is doing wonders for technology-based predictions. These predictions can be a game-changer for almost all businesses, as they can empower them to make the right business decisions. Predictive analytics is a technique that looks into the past, understands it thoroughly, and then, based on it, predicts potential future events. Prediction is both an art and a science.
There is this popular saying in data science– everything can be predicted – who will click, who will buy, who will lie, and even who will die.
There is no exaggeration in this statement. Statistically speaking, everything is predictable to a very large extent. Insurance companies worldwide have been doing this for many years. We can predict which customer will buy which product. We can also predict which customer may buy something extra if it is promoted more or the customer is influenced more. A job site can predict the best possible match based on skills and requirements. An entertainment business can predict which movie you would like to watch next, and a music application can predict which song you would like to listen to on Monday morning. Before elections, newspapers and media predict who will win based on opinion polls and surveys. Social sites can predict possible 'friends you may know', and dating sites can predict potential "matches". Predictive analytics is not merely a corporate buzzword; it is the philosophy by which compliance products are inspired and driven. Almost all digital compliance products deal with huge volumes of customer text data in various formats such as emails, loose files, images, audio, video, and so on. They can understand it well and then predict. Examples include eDiscovery’s predictive coding (ref), advanced supervision (ref), and information classifier (ref).
Most compliance products can provide visibility into customer data by enriching metadata around various business workflows and surfacing relevant information. They can find the most actionable content, content that is more relevant and narrower in scope. Lastly, in the coming years, data growth will be massive. Data will be the new oil. Businesses that develop the intelligence to truly “understand” the data by reading between the lines will thrive in the market. There is huge potential to infer from data and predict actionable insights, and this is going to be a continuous process.
AI and machine learning are touching almost every walk of life. Systems, and particularly storage systems and data services, are no exception. In this talk, the various facets of AI/ML applications in storage systems will be discussed. First, we will discuss the types of AI for systems, and then some specific directions in storage systems research that are benefiting from the use of AI/ML.
Software Transactional Memory systems (STMs) have garnered significant interest as an elegant alternative for addressing synchronization and concurrency issues with multi-threaded programming in multi-core systems. Client programs use STMs by issuing transactions. STM ensures that transaction either commits or aborts. A transaction aborted due to conflicts is typically re-issued with the expectation that it will complete successfully in a subsequent incarnation. However, many existing STMs fail to provide starvation freedom, i.e., in these systems, it is possible that concurrency conflicts may prevent an incarnated transaction from committing.
To overcome this limitation, we systematically derive a novel starvation-free algorithm for multi-version STM. Our algorithm can be used either when the number of versions is unbounded and garbage collection is used, or when only the latest K versions are maintained (KSFTM). We have demonstrated that our proposed algorithm performs better than existing state-of-the-art STMs on the applications of the STAMP benchmark.
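To make the "latest K versions" idea concrete, here is a minimal sketch of a shared object that keeps a bounded, timestamp-ordered version list. It is not the starvation-free algorithm from the talk and contains no conflict detection or locking; the class name, the integer timestamps and the choice to return None when the needed version has been discarded are all assumptions of this example.

```python
import bisect


class KVersionObject:
    """Shared object that retains only its K most recent versions."""

    def __init__(self, k, initial=None):
        self.k = k
        self.timestamps = [0]      # kept sorted in ascending order
        self.values = [initial]    # values[i] was written at timestamps[i]

    def write(self, ts, value):
        """Install a version created by the transaction with timestamp ts."""
        i = bisect.bisect_right(self.timestamps, ts)
        self.timestamps.insert(i, ts)
        self.values.insert(i, value)
        if len(self.timestamps) > self.k:
            # Bounded history: drop the oldest version.
            del self.timestamps[0]
            del self.values[0]

    def read(self, ts):
        """Return the newest version not newer than ts, or None if that
        version is no longer retained (the reader would have to abort)."""
        i = bisect.bisect_right(self.timestamps, ts)
        if i == 0:
            return None
        return self.values[i - 1]
```

With an unbounded list and a separate garbage collector, the same read logic applies; the bounded variant simply trades a chance of reader aborts for a fixed memory footprint per object.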
Containers and Kubernetes are changing the way enterprises build scalable applications. There is also a significant shift in how infrastructure, storage and networking are managed. These cloud-native applications are blurring the line between on-premises and cloud. The requirements for elasticity, scalability, resiliency, performance, QoS and security are shaping storage and data protection requirements very differently than before.
Cache(s) are generally used to reduce the average cost of accessing the content, and that’s why they are designed to work with faster storage (such as DRAM). In this talk, we will be covering how we can take advantage of large PMEM devices (comparatively cheaper in cost with slightly higher r/w latency compared to DRAM) to reduce our amortized access cost.
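The trade-off above can be put into numbers with the usual weighted-average model of cache access cost. The sketch below is a generic formula, not a measurement from this work; the function and parameter names are hypothetical, and any latencies passed in are whatever the reader chooses to assume.

```python
def average_access_cost(hit_ratio, cache_latency, backend_latency):
    """Expected cost of one access: hits are served from the cache,
    misses fall through to the slower backend."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_latency + (1.0 - hit_ratio) * backend_latency
```

Under this model, a larger cache on a slightly slower medium lowers the average cost whenever the extra hits it captures outweigh the added per-hit latency.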
In terms of performance, we will discuss how PMEM in volatile mode can entirely replace DRAM, still offering 3X performance benefits in certain cases. We will also discuss how PMEM offers an option to make cache persistent, resulting in improved cache rebuild times during planned and unplanned failovers.
We will iterate over various PMEM modes and why we chose FS-DAX now and plan to use KMEM-DAX in the future for our volatile caching. We also plan to discuss scenarios where we got up to ~3X improvement vs ~4% degradation and various techniques to get the best of both worlds.
We also show how incorporating PMEM in a Hyperconverged Infrastructure (HCI) solution results in:
- running more user VMs
- better hardware utilization
- lower cost
As a case study, we will look at the improvements we got with PMEM in the Nutanix HCI solution. In the end, we will share our experiences during this work and the things to consider while configuring PMEM.
NVMe-oF arrays are gradually replacing traditional HDD-based storage systems in data centres. The primary reason is the improving performance and falling cost of NVMe-oF arrays, driven by increased production of NVMe drives. Modern applications demand ever better performance from hardware. During the last few decades, a storage system would be bought based on how well it adhered to the specification and on the performance it delivered when evaluated with benchmarking tools. Today, enterprises place more weight on actual run-time numbers and real-world workloads, and most are looking at application-specific benchmarks instead of standard tools. In our talk we will highlight a proven methodology, focused primarily on testing industry solutions on NVMe-oF arrays, that provides fair and accurate results, and we will show how these solutions should be tested and validated. Additionally, we discuss a comprehensive technique that covers most of the industry-wide applicable solutions for qualifying any NVMe-oF array. This presentation will discuss how to qualify an NVMe-oF array against more than ~95% of industry-standard solutions, so that the storage array has no interoperability issues with solutions and applications such as software-defined storage, databases, Ceph, containers, and virtualization.
Solid State Drive (SSD) capacity and performance have been increasing over the years, and as a result thermal management in SSDs is becoming very challenging. As an SSD works harder to meet growing demands, it consumes more power and consequently generates more heat.
Apart from designing SSDs to tolerate higher temperatures, another effective way to address overheating is good thermal management. A thermal throttling mechanism is thus fundamental for SSDs. A thermal sensor in the drive monitors the temperature, and temperature details can be extracted via the S.M.A.R.T. command.
The aim of this paper is to study the thermal impact of different I/O types and chunk sizes in multi-core SSDs, with density, die size and NAND type (SLC and TLC) as parameters. This study will help us understand and improve the thermal throttling policy.
Engineers at Nutanix have been working on the challenge of building a next-generation architecture for its distributed storage fabric. Scaling this architecture to the needs of the future required three primary objectives: significant improvements in sustained random write performance, support for large-capacity deep storage nodes for multi-petabyte scale and reducing storage latency by a significant magnitude.
These goals required re-imagining the core approach to how metadata is stored in the fabric management system and moving the metadata closer to where the data is stored. After extensive research and testing, RocksDB was chosen as the core component for this project, based on its open-source pedigree, proven reliability and industry adoption. Within a few months, the engineering team was able to ramp up expertise, build confidence in the open-source technology and eventually grow its adoption into several core products at Nutanix.
In this technical talk, we will share the new architecture, the deployment model and some of the early lessons learned in adopting RocksDB, and discuss some innovative enhancements we were able to make to meet our performance goals and objectives. One of the significant improvements has been the addition of async read/write support to RocksDB. Currently, open source RocksDB exposes blocking I/O APIs, which can limit overall system throughput under resource constraints. We developed a fibers/coroutine-based non-blocking I/O solution for RocksDB. In addition, we plan to talk about topics and projects that have been built on this enhanced RocksDB implementation. These projects will become the foundation for future Nutanix products.
Apache Ozone is a robust, distributed key-value object store for Hadoop with a layered architecture and strong consistency. It provides object store semantics (like Amazon S3) and can handle billions of objects. The Apache Ozone object store recently implemented fast atomic rename and delete operations with O(1) complexity. This dramatic optimization lowers job latency, which translates into lower total cost of ownership (TCO) for analytics workloads. Most big data analytics tools, such as Apache Hive and Apache Spark, often write output to temporary locations and then rename the location at the end of the job to make it publicly visible. In object stores such as Amazon S3, rename is not a native operation; it is implemented using a costly copy followed by a delete. The rename operations can often take longer than the analytics process itself. These job committers demand atomic rename for improved performance as well as consistent listing operations. This talk will be a deep dive into the Apache Ozone architecture, describing the atomic rename and delete implementation that greatly boosts analytics job performance. We will walk through performance benchmark results that show a consistent performance gain across various analytics workloads. Finally, we will also talk about a future roadmap to leverage this new design to achieve efficient lock management for namespace operations by avoiding global locks.
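Why rename is costly in a flat key space but can be constant-time in a hierarchical one can be shown with a small in-memory model. This sketch is not Ozone's actual metadata schema or API; the dictionary layout, the (parent_id, name) keying and all names below are assumptions chosen only to contrast the two approaches.

```python
def rename_flat(store, old_prefix, new_prefix):
    """Flat key space: every key under the prefix is rewritten,
    so the cost grows with the number of objects."""
    moved = 0
    for key in [k for k in store if k.startswith(old_prefix)]:
        store[new_prefix + key[len(old_prefix):]] = store.pop(key)
        moved += 1
    return moved  # number of keys touched


class Namespace:
    """Hierarchical layout: each entry is keyed by (parent_id, name)."""

    ROOT = 0

    def __init__(self):
        self.entries = {}   # (parent_id, name) -> object_id
        self.next_id = 1

    def create(self, parent_id, name):
        object_id = self.next_id
        self.next_id += 1
        self.entries[(parent_id, name)] = object_id
        return object_id

    def rename(self, parent_id, old_name, new_name):
        # Children refer to the directory by id, so they are untouched.
        self.entries[(parent_id, new_name)] = self.entries.pop((parent_id, old_name))
        return 1  # one entry touched, however large the directory is

    def lookup(self, path):
        current = self.ROOT
        for part in path.strip("/").split("/"):
            current = self.entries[(current, part)]
        return current
```

In the hierarchical model a job's temporary output directory can be made visible by updating a single entry, which is also what makes the operation easy to perform atomically.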
Ceph is an open source, distributed, highly scalable, unified software-defined storage solution. It is often used as an S3 object storage backend, which can handle petabytes of data, using the Ceph RADOS Gateway (RGW).
In this session, we will discuss how the data can be replicated or migrated from these RGW servers on On-Prem clusters to multiple cloud providers, thus easing data movement in a hybrid cloud environment.
Object Storage is increasingly getting used on premises for a variety of use cases. Some of these use cases and applications require file based access along with object based access to the same data. Object Storage provides a rich interface for managing data, for example versioning of objects, lifecycle policy and WORM capabilities. Some file based access use cases benefit from these rich data management functions.
In this presentation we focus on what it takes to build a performant and coherent file access to a scale out Object Storage System.
Digital transformation is driving the need for application modernization by embracing microservice architecture based on Kubernetes containerization. At the same time, edge computing has started becoming a reality in industry verticals like manufacturing and telecom. Edge use cases generate enormous volumes of data at the edge sources, and moving or copying that data for processing in a centralized hub or cloud is time-consuming and can prove very expensive. These use cases and the associated data governance challenges are driving the Data Gravity metaphor, in which applications and services are attracted to the data instead of the other way round. Hyper-Converged Infrastructure (HCI) designed for application modernization is a natural fit for such Data Gravity-driven edge computing workloads. In this session, we present the key characteristics of data gravity-driven edge computing use cases, the suitability of app modernization for such use cases, and a vendor-neutral viewpoint on design and feature functionality for HCI systems catering to them.
NVMe technology is rapidly evolving to bridge gaps in the memory industry. NVMe spec 1.4 has been completely adopted, and the next, refactored version is in the pipeline with more streamlining. Realizing new features in a real device takes longer and faces challenges such as cross-module and legacy-module impact and functional blockage. Having the host test infrastructure and the tests ready before the actual firmware or hardware is available is critical and helps achieve early time to market.
The proposed NVMe Protocol Simulator enables easy prototyping and shift-left development of the host test framework in parallel, without depending on hardware or firmware for new features. The solution provides complete NVMe subsystem support with up to 32 controllers. It also supports features such as Format NVM, End-to-End Data Protection (by enabling metadata support), Simple Copy, and a Boot Partitions area that can be read by the host without enabling the controller and queues. It also helps with corner-case and device-limited scenarios, such as emulating 64k I/O queues. The Samsung open source repo provides some additional optional features that are not available upstream. Because the NVMe protocol simulator is based on QEMU, it has the advantage of emulating the NVMe drive as a PCIe device, which enables ‘as-is’ use of the host test infrastructure across the simulator and a real SSD.
The NVMe protocol simulator provides a cost-effective way to develop host test infrastructure and tests for new NVMe features early. In addition, developing the simulator enables early detection of errata.
Zoned Namespace SSDs are SSDs that implement the ZNS command set as specified by the NVM Express organization. ZNS SSDs provide an interface to the host such that the host/applications can manage the data placement on these SSDs directly.
This presentation covers the basics of ZNS SSDs and demonstrates the software stack through which MyRocks can be run on ZNS SSDs. MyRocks is MySQL integrated with RocksDB. The presentation discusses ZNS support in RocksDB and MySQL and evaluates the performance of MyRocks on a ZNS SSD vis-à-vis a conventional SSD.