5G Streaming Questions Answered

Michael Hoard

Dec 2, 2020

The broad adoption of 5G, the Internet of Things (IoT) and edge computing is reshaping the nature and role of enterprise and cloud storage. Preparing for this significant disruption is important. It’s a topic the SNIA Cloud Storage Technologies Initiative covered in our recent webcast “Storage Implications at the Velocity of 5G Streaming,” where my colleagues, Steve Adams and Chip Maurer, took a deep dive into the 5G journey, streaming data and real-time edge AI, 5G use cases and much more. If you missed the webcast, it’s available on-demand along with a copy of the webcast slides.

As you might expect, this discussion generated some intriguing questions. As promised during the live presentation, our experts have answered them all here.

Q. What kind of transport do you see that is going to be used for those (5G) use-cases?

A. At a high level, 5G consists of three primary slices: enhanced mobile broadband (eMBB), ultra-reliable low latency communications (URLLC) and massive machine type communication (mMTC). Each of these is better suited to different use cases; for example, normal smartphone usage relies on eMBB, factory robotics relies on URLLC, and intelligent device or sensor applications like farming, edge computing and IoT rely on mMTC.

The primary 5G standards-making bodies include:

  • The 3rd Generation Partnership Project (3GPP) – formulates 5G technical specifications which become 5G standards. Release 15 was the first release to define 5G implementations, and Release 16 is currently underway.
  • The Internet Engineering Task Force (IETF) partners with 3GPP on the development of 5G and new uses of the technology. Particularly, IETF develops key specifications for various functions enabling IP protocols to support network virtualization. For example, IETF is pioneering Service Function Chaining (SFC), which will link the virtualized components of the 5G architecture—such as the base station, serving gateway, and packet data gateway—into a single path. This will permit the dynamic creation and linkage of Virtual Network Functions (VNFs).
  • The International Telecommunication Union (ITU), based in Geneva, is the United Nations specialized agency focused on information and communication technologies. ITU World Radio communication conferences revise the international treaty governing the use of the radio-frequency spectrum and the geostationary and non-geostationary satellite orbits.

To learn more, see

Q. What if the data source at the Edge is not close to where the signal is good to connect to cloud? And, I wonder how these algorithm(s) / data streaming solutions should be considered?

A. When we look at a 5G application like massive Machine Type Communications (mMTC), we expect many kinds of devices will connect only occasionally, e.g. battery-operated sensors attached to farming water sprinklers or water pumps. Therefore, long distance, low bandwidth, sporadically connected 5G network applications will need to tolerate long stretches of no contact without losing context or connectivity, as well as adapt to variations in signal strength and signal quality.

Additionally, 5G supports three broad ranges of wireless frequency spectrum: Low, Mid and High. The lower frequency range provides lower bandwidth for broader or more wide area wireless coverage.  The higher frequency range provides higher bandwidth for limited area or more focused area wireless coverage. To learn more, check out The Wired Guide to 5G.

On the second part of the question regarding algorithms and data streaming solutions, we anticipate streaming IoT data from sporadically connected devices can still be treated as streaming data sources from a data ingestion standpoint. It is likely to consist of broad snapshots (pre-stipulated time windows) with potential intervals of null sets of data when compared with other types of data sources. Streaming data, regardless of the interval of data arrival, has value because the “last known state” can be compared against the known states of previous intervals. Calculating trends over that data is one of the most common and meaningful ways to extract value and make decisions.
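
As a concrete illustration, here is a minimal sketch of this ingestion pattern in Python (ours, not from the webcast): windows with no new data (null sets) carry the last known state forward, and a simple trend is computed across windows. The sensor readings and window size are illustrative assumptions.

```python
from datetime import datetime, timedelta

# Hypothetical readings from a sporadically connected field sensor:
# (timestamp, value); note the 75-minute silence before the last reading.
readings = [
    (datetime(2020, 12, 1, 0, 0), 41.2),
    (datetime(2020, 12, 1, 0, 15), 41.8),
    (datetime(2020, 12, 1, 1, 30), 43.1),
]

def windowed_states(readings, start, end, window=timedelta(minutes=15)):
    """Yield (window_start, last_known_value) for each time window.

    Windows with no new data repeat the last known state, so downstream
    trend calculations always have a value to work with.
    """
    last_known, idx, t = None, 0, start
    while t < end:
        while idx < len(readings) and readings[idx][0] < t + window:
            last_known = readings[idx][1]   # newest value wins the window
            idx += 1
        yield t, last_known
        t += window

states = list(windowed_states(readings,
                              datetime(2020, 12, 1, 0, 0),
                              datetime(2020, 12, 1, 2, 0)))
values = [v for _, v in states if v is not None]
print("trend:", values[-1] - values[0])     # simple delta across the period
```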

Q. Is there an improvement with the latency in 5G from cloud to data center?

A. By 2023, we should see the introduction of 5G ultra-reliable low latency communication (URLLC) capabilities, which will increase the amount of time-sensitive data ingested into and delivered from wireless access networks. This will increase demand for fronthaul and backhaul bandwidth to move time-sensitive data from remote radio units to baseband stations and aggregation points like metro area central offices.

As an example, to reduce latency, some hyperscalers have multiple connections out to regional co-location sites, central offices and in some cases sites near cell towers. To save on backhaul transport costs and improve 5G latency, some cloud service providers (CSP) are motivated to locate their networks as close to users as possible.

Independent of CSPs, we expect that backhaul bandwidth will increase to support the growth in wireless access bandwidth of 5G over 4G LTE. But it isn’t the only reason backhaul bandwidth is growing. COVID-19 revealed that many cable and fiber access networks were built to support much more download than upload traffic. The explosion in work and study from home, as well as video conferencing, has changed the ratio of upload to download. So many wireline operators (which are often also wireless operators) are upgrading their backhaul capacity in anticipation that not everyone will go back to the office any time soon and some may hardly ever return to the office.

Q. Are 5G speeds assured end-to-end (i.e., from mobile device to tower and within the MSP’s infrastructure)? We understand most MSPs have improved low-latency speeds between device and tower.

A. We expect specialized services like 5G ultra-reliable low latency communication (URLLC) will help improve low latency and narrow jitter communications. As far as “assured,” this depends on the service provider SLA. More broadly, 5G mobile broadband and massive machine type communications are typically best-effort networks, so generally there is no overall guaranteed or assured latency or jitter profile.

5G supports the largest range of radio frequencies. The high frequency range uses millimeter (mm) wave signals to deliver the theoretical max of 10 Gbps, which by default means reduced latency along with higher throughput. For more information on deterministic over-the-air network connections using 5G URLLC and TSN (Time Sensitive Networking), see this ITU presentation “Integration of 5G and TSN.”

To provide a bit more detail, mobile devices communicate wirelessly with Remote Radio Head (RRH) units co-located at the antenna tower site, while baseband unit (BBU) processing is typically hosted in local central offices. The local connection between RRHs and BBUs is called the fronthaul network (from antennas to central office). Fronthaul networks are usually fiber optic and support the eCPRI 7.2 protocol, which provides time-sensitive network delivery. Therefore, this portion of the wireless data path is deterministic even if the over-the-air or other backhaul portions of the network are not.

Q. Do we use a lot of matrix calculations in streaming data, and do we have a circuit model for matrix calculations for convenience?

A. We see this applying case by case, based on the type of data. What we often see is that many edge hardware systems include extensive GPU support to facilitate matrix calculations for real-time analytics.

Q. How do you see the deployment and benefits of Hyperconverged Infrastructure (HCI) on the edge?

A. Great question. The software flexibility of HCI can provide many advantages on the edge over dedicated hardware solutions. Ease of deployment, scalability and service provider support make HCI an attractive option. See this very informative article from TechTarget “Why hyper-converged edge computing is coming into vogue” for more details.

Q. Can you comment on edge-AI accelerator usage and future potentials? What are the places these will be used?

A. Edge processing capabilities include many resources to improve AI capabilities. Things like computational storage and increased use of GPUs will only serve to improve analytics performance. Here is a great article on this topic.

Q. How important is high availability (HA) for edge computing?

A. For most enterprises, edge computing reliability is mission critical. Therefore, almost every edge processing solution we have seen includes complete and comprehensive HA capabilities.

Q. How do you see Computational Storage fitting into these Edge use cases?  Any recommendations on initial deployment targets?

A. The definition and maturity of computational storage is rapidly evolving, and it is targeted to offer huge benefits for management and scale of 5G data usage on distributed edge devices. First and foremost, 5G data can be used to train deep neural networks at higher rates due to parallel operation of “in-storage processing.” Petabytes of data may be analyzed in storage devices or within storage enclosures (not moved over the network for analysis). Secondly, computational storage may also accelerate the process of conditioning data or filtering out unwanted data.

Q. Do you think that the QUIC protocol will be a standard for the 5G communication?

A. So far, TCP is still the dominant transport layer protocol within the industry. QUIC was initially proposed by Google and is widely adopted in the Chrome/Android ecosystem. QUIC is getting increased interest and adoption due to its performance benefits and ease of implementation (it can be implemented in user space and does not need OS kernel changes).

For more information, here is an informative SNIA presentation on the QUIC protocol.

Please note this is an active area of innovation. There are other methods, including Apple iOS devices using MPTCP, and for inter/intra data center communications RoCE (RDMA over Converged Ethernet) is also gaining traction, as it allows for direct memory access without consuming host CPU cycles. We expect TCP/QUIC/RDMA will all co-exist, and other new L3/L4 protocols will continue to emerge for next generation workloads. The choice will depend on workloads, service requirements and system availability.

Why Cloud Standards Matter

Alex McDonald

Nov 18, 2020

Effective cloud data management and interoperability is critical for organizations looking to gain control and security over their cloud usage in hybrid and multicloud environments. The Cloud Data Management Interface (CDMI™), also known as the ISO/IEC 17826 International Standard, is intended for application developers who are implementing or using cloud storage systems, and who are developing applications to manage and consume cloud storage. It specifies how to access cloud storage namespaces and how to interoperably manage the data stored in these namespaces. Standardizing the metadata that expresses the requirements for the data leads to multiple clouds from different vendors treating your data the same.

First published in 2010, the CDMI standard (ISO/IEC 17826:2016) is now at version 2.0 and will be the topic of our webcast on December 9, 2020, “Cloud Data Management & Interoperability: Why A CDMI Standard Matters,” where our experts, Mark Carlson, Co-chair of the SNIA Technical Council, and Eric Hibbard, SNIA Storage Security Technical Work Group Chair, will provide an overview of the CDMI standard and cover CDMI 2.0:
  • Support for encrypted objects
  • Delegated access control
  • General clarifications
  • Errata contributed by vendors implementing the CDMI standard
This webcast will be live and Mark and Eric will be available to answer your questions on the spot. We hope to see you there. Register today.

A New Wave of Video Analytics

Jim Fister

Nov 10, 2020

Adoption of cognitive services based on video and image analytics is on the rise. It’s an intriguing topic that the SNIA Cloud Storage Technologies Initiative will dive into on December 2, 2020 at our live webcast, “How Video Analytics is Changing the Way We Store Video.” In this webcast, we will look at some of the benefits and factors driving this adoption, as well as explore compelling projects and required components for a successful video-based cognitive service. This includes some great work in the open source community to provide methods and frameworks, and some standards being developed to unify the ecosystem and allow interoperability of models and architectures. Finally, we’ll cover the data required to train such models, the data sources and how they need to be treated.

As you might guess, there are challenges in how we do all of this. Many video archives are analog and tape-based, which doesn’t stand up well to mass ingestion or the back and forth of training algorithms. How can we start to define new architectures and leverage the right medium to make our archives accessible while still focusing on performance at the point of capture?

Join us for a discussion on:

  • New and interesting use cases driving adoption of video analytics as a cognitive service
  • Work in the open source arena on new frameworks and standards
  • Modernizing archives to enable training and refinement at will
  • Security and governance where personal identifiable information and privacy become a concern
  • Plugging into the rest of the ecosystem to build rich, video centric experiences for operations staff and consumers

Register today and bring your questions for our experts, who will be ready to answer them on the spot. We look forward to seeing you.

NVMe Key-Value Standard Q&A

John Kim

Nov 9, 2020

Last month, Bill Martin, SNIA Technical Council Co-Chair, presented a detailed update on what’s happening in the development and deployment of the NVMe Key-Value standard. Bill explained where Key Value fits within an architecture, why it’s important, and the standards work that is being done between NVM Express and SNIA. The webcast was one of our highest rated. If you missed it, it’s available on-demand along with the webcast slides. Attendees at the live event had many great questions, which Bill Martin has answered here:

Q. Two of the most common KV storage mechanisms in use today are AWS S3 and RocksDB. How does the NVMe KV standard align with or differ from them? How difficult would it be to map between the APIs and semantics of those other technologies and NVMe KV devices?

A. KV Storage is intended as a storage layer that would support these and other object storage mechanisms. There is a publicly available KVRocks (a RocksDB-compatible key-value store and MyRocks-compatible storage engine designed for KV SSDs) on GitHub. There is also a Ceph Object storage design available. These are example implementations that can help an implementer get to an efficient use of NVMe KV storage.

Q. At which layer will my app stack need to change to take advantage of KV storage?  Will VMware or Linux or Windows need to change at the driver level?  Or do the apps need to be changed to treat data differently?  If the apps don’t need to change doesn’t this then just take the data layout tables and move them up the stack in to the server?

A. The application stack needs to change at the point where it interfaces to a filesystem, where the interface would change from a filesystem interface to a KV storage interface. In order to take advantage of Key Value storage, the application itself may need to change, depending on what the current application interface is. If the application is talking to a RocksDB or similar interface, then the driver could simply be changed out to allow the app to talk directly to Key Value Storage. In this case, the application does not care about the API or the underlying storage. If the application is currently interfacing to a filesystem, then the application itself would indeed need to change and the KV API provides a standardized interface that multiple vendors can support to provide both the necessary libraries and access to a Key Value storage device. There will need to be changes in the OS to support this in providing a kernel layer driver for the NVMe KV device. If the application is using an existing driver stack that goes through a filesystem and does not change, then you cannot take advantage of KV Storage, but if the application changes or already has an object storage interface then the kernel filesystem and mapping functions can be removed from the data path.
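
To picture where that swap happens, here is a minimal sketch in Python, with a hypothetical interface of our own rather than any vendor’s actual API: the application codes against a two-method put/get store, so the backend can move from a RocksDB-style software library to a KV SSD driver without touching application logic.

```python
class KVBackend:
    """The interface the application is written against."""
    def put(self, key: bytes, value: bytes) -> None:
        raise NotImplementedError
    def get(self, key: bytes) -> bytes:
        raise NotImplementedError

class InMemoryKV(KVBackend):
    """Stand-in for a RocksDB-style software store; a KV SSD driver
    wrapper (hypothetical here) would expose the same two methods."""
    def __init__(self):
        self._data = {}
    def put(self, key, value):
        self._data[key] = value
    def get(self, key):
        return self._data[key]

def save_document(store: KVBackend, doc_id: str, body: bytes) -> None:
    # Application logic neither knows nor cares which backend is used.
    store.put(doc_id.encode(), body)

store = InMemoryKV()
save_document(store, "invoice-42", b"...contents...")
assert store.get(b"invoice-42") == b"...contents..."
```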

Q. Is there a limit to the length of a key or value in the KV Architecture?

A. There are limits to the key and value sizes in the current NVMe standard. The current implementation limits the key to 16 bytes due to a desire to pass the key within the NVMe command. The other architectural limit on a key is that the length of the key is specified in a field that allows up to 255 bytes for the key length. To utilize this, an alternative mechanism for passing the key to the device is necessary. For the value, the limit on the size is 4 GBytes.
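
For illustration, a minimal sketch of those two limits as a host-side check; the constants come from the answer above, while the helper itself is hypothetical.

```python
MAX_KEY_BYTES = 16              # key passed within the NVMe command
MAX_VALUE_BYTES = 4 * 1024**3   # 4 GByte architectural value limit

def validate_kv_pair(key: bytes, value: bytes) -> None:
    """Reject pairs the current NVMe KV implementation cannot store."""
    if not 1 <= len(key) <= MAX_KEY_BYTES:
        raise ValueError(f"key must be 1..{MAX_KEY_BYTES} bytes, "
                         f"got {len(key)}")
    if len(value) > MAX_VALUE_BYTES:
        raise ValueError("value exceeds the 4 GByte limit")

validate_kv_pair(b"file-handle-7", b"payload")   # passes
# validate_kv_pair(b"x" * 32, b"payload")        # would raise ValueError
```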

Q. Are there any atomicity guarantees (e.g. for overwrites)?

A. The current specification makes atomicity at the KV level mandatory. In other words, if a KV Store command overwrites an existing KV pair and there is a power failure, you either get all of the original value or all of the new value.

Q. Is KV storage for a special class of storage called computational storage or can it be used for general purpose storage?

A. This is for any application that benefits from storing objects as opposed to storing blocks. It is unrelated to computational storage but may be of use in computational storage applications. One application that has been considered is a filesystem in which, rather than storing blocks and keeping a mapping from each file handle to the set of blocks that contain the file contents, you would use KV storage where the file handle is the key and the value holds the file contents.

Q. What are the most frequently used devices to use the KV structure?

A. If what is being asked is, what are the devices that provide a KV structure, then the answer is, we expect the most common devices using the KV structure will be KV SSDs.

Q. Does the NVMe KV interface require two accesses in order to get the value (i.e., one access to get the value size in order to allocate the buffer and then a second access to read the value)?

A. If you know the size of the object, or if you can pre-allocate enough space for your maximum size object, then you can do a single access. This is no different than current implementations, where you actually have to specify how much data you are retrieving from the storage device by specifying a starting LBA and a length. If you do not know the size of the value and require it in order to retrieve the value, then you would indeed need to submit two commands to the NVMe KV storage device.
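
A minimal sketch of the two retrieval patterns, with a toy device simulator standing in for real NVMe KV commands (the class and method names are hypothetical):

```python
class KVDeviceSim:
    """Toy stand-in; each method models one command round trip."""
    def __init__(self, data):
        self._data = data
    def value_size(self, key):          # first command: fetch the size
        return len(self._data[key])
    def read(self, key, length):        # read into a buffer of `length`
        return self._data[key][:length]

def retrieve(dev, key, known_size=None):
    if known_size is not None:
        # One access: the host pre-allocates a buffer of the known
        # (or maximum expected) value size.
        return dev.read(key, known_size)
    # Two accesses: ask the device for the size, then read the value.
    return dev.read(key, dev.value_size(key))

dev = KVDeviceSim({b"k": b"hello"})
assert retrieve(dev, b"k", known_size=16) == b"hello"   # single round trip
assert retrieve(dev, b"k") == b"hello"                  # two round trips
```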

Q. Does the device know whether an object was compressed, and if not how can a previously compressed object be stored?

A. The hardware knows if it does compression automatically and therefore whether it should de-compress the object. If the storage device supports compression and the no-compress option, then the device will store metadata with the KV pair indicating if no-compress was specified when storing the file in order to return appropriate data. If the KV storage device does not perform compression, it can simply support storage and retrieval of previously compressed objects. If the KV storage device performs its own compression and is given a previously-compressed object to store and the no-compress option is not requested, the device will recompress the value (which typically won’t result in any space savings) or if the no-compress option is requested the device will store the value without attempting additional compression.

Q. On flash, erased blocks are fixed sizes, so how does Key Value handle defrag after a lot of writes and deletes?

A. This is implementation specific and depends on the size of the values that are stored. This is much more efficient on values that are approximately the size of the device’s erase block size as those values may be stored in an erase block and when deleted the erase block can be erased. For smaller values, an implementation would need to manage garbage collection as values are deleted and when appropriate move values that remain in a mostly empty erase block into a new erase block prior to erasing the erase block. This is no different than current garbage collection. The NVMe KV standard provides a mechanism for the device to report optimal value size to the host in order to better manage this as well.
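
For intuition, here is a minimal sketch of that value-level garbage collection, using illustrative in-memory structures rather than real FTL state: live values migrate out of a mostly empty erase block so the whole block can be erased.

```python
def compact_erase_block(old_block, fresh_block, live_keys):
    """Move surviving values to a fresh erase block, then erase the old.

    old_block / fresh_block: dicts modeling key -> value inside a block;
    live_keys: keys not yet deleted by the host.
    """
    for key, value in old_block.items():
        if key in live_keys:            # deleted values are simply dropped
            fresh_block[key] = value    # migrate live values first
    old_block.clear()                   # now the old block can be erased

old = {b"a": b"v1", b"b": b"v2", b"c": b"v3"}   # mostly deleted block
fresh = {}
compact_erase_block(old, fresh, live_keys={b"a"})
assert fresh == {b"a": b"v1"} and old == {}
```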

Q. What about encryption?  Supported now or will there be SED versions of [key value] drives released down the road?

A. There is no reason that a product could not support encryption with the current definition of key value storage. The release of SED (self-encrypting drive) products is vendor specific.

Q. What are considered to be best use cases for this technology? And for those use cases - what's the expected performance improvement vs. current NVMe drives + software?

A. The initial use case is for database applications where the database is already storing key/value pairs. In this use case, experimentation has shown that a 6x performance improvement from RocksDB to a KV SSD implementing KV-Rocks is possible.

Q. Since writes are complete (value must be written altogether), does this mean values are restricted to NVMe's MDTS?

A. Yes. Values are limited by MDTS (maximum data transfer size). A KV device may set this value to something greater than a block storage device does in order to support larger value sizes.

Q. How do protection schemes work with key-value (erasure coding/RAID/...)?

A. Since key value deals with complete values as opposed to the blocks that make up user data, RAID and erasure coding are usually not applicable to key value systems. The most appropriate data protection scheme for key value storage devices would be a mirrored scheme. If a storage solution performed erasure coding on data first, it could store the resulting EC fragments or symbols on key value SSDs.

Q. So Key Value is not something built on top of block like Object and NFS are?  Object and NFS data are still stored on disks that operate on sectors, so object and NFS are layers on top of block storage?  KV is drastically different, uses different drive firmware and drive layout?  Or do the drives still work the same and KV is another way of storing data on them alongside block, object, NFS?

A. Today, there is only one storage paradigm at the drive level: block. Object and NFS are mechanisms in the host to map data models onto block storage. Key Value storage is a mechanism for the storage device to map from an address (a key) to a physical location where the value is stored, avoiding a translation in the host from the key/value pair to a set of block addresses which are then mapped to physical locations where data is then stored. A device may have one namespace that stores blocks and another namespace that stores key value pairs. There is no difference in the low-level storage mechanism, only in the mapping process from address to physical location. Another difference from block storage is that the value stored is not a fixed size.

Q. Could you explain more about how tx/s is increased with KV?

A. The increase in transfers/second occurs for two reasons: one is because the translation layer in the host from key/value to block storage is removed; the second is that the commands over the bus are reduced to a single transfer for the entire key value pair. The latency savings from this second reduction is less significant than the savings from removing translation operations that have to happen in the host.

Keep up-to-date on work SNIA is doing on the Key Value Storage API Specification at the SNIA website.

A SNIA Superpower: PAS submitter to ISO

Linda Capcara

Oct 21, 2020

SNIA’s Technical Council is one of the crown jewels of the organization. Made up of a group of acknowledged storage experts, the Technical Council oversees and manages SNIA Technical Work Groups, reviews architectures submitted by work groups, and is SNIA’s technical liaison to standards organizations. One of the Council’s superpowers is its ISO JTC-1 designation as an ARO and a PAS submitter. What does that actually mean? It’s a very big deal! SNIA is only one of 13 organizations worldwide that have the PAS submission capability, putting it in exclusive company. The list includes:

“Traditionally, ISO standards can only reference one another. By approving SNIA as an Approved Reference Organization (ARO), JTC1 is acknowledging SNIA’s rigorous development process and technical credibility. This allows the documents that SNIA develops to be used to underpin other ISO standards,” said Arnold Jones, Technical Council Managing Director, SNIA. “In addition, SNIA has satisfied the extensive criteria to become a Publicly Available Standard (PAS) submitter.”

ISO is an independent, international organization with a membership of 165 national standards bodies. It is the international standards organization. When a SNIA standard becomes ISO-approved, virtually every country in the world has access to it along with confidence it will work well with solutions in the marketplace. ISO/IEC JTC 1 is a joint technical committee of ISO and the International Electrotechnical Commission (IEC). Its purpose is to develop, maintain and promote standards in the fields of information technology and Information and Communications Technology.

Specifications developed by a PAS Submitter can use a streamlined approval process within ISO, assuring that recent, advanced technology moves from industry consensus to international standard as quickly and efficiently as possible. SNIA was reaffirmed as a PAS submitter by ISO in 2018 for another five-year term, with its status in effect until September 2023.

SMI-S Storage Management Specification

The Storage Management Initiative-Specification (SMI-S) provides a real-world example of how a PAS submission works within SNIA. SMI-S was first approved as an ISO standard in 2002. Today, it has been implemented in over 1,350 storage products that provide access to common storage management functions and features. During its lifetime, the SMI-S standard has been approved by ISO many times. The current international standard for SMI-S was based on SMI-S v1.5, which was completed in 2011, submitted for ISO approval in 2012, and formally adopted in 2014 as the latest revision of ISO/IEC 24775. SMI-S 1.8 was recently sent to ISO as an update to ISO/IEC 24775.

SNIA believes SMI-S 1.8 rev 5 is the very best version of the specification and should be adopted worldwide. As a PAS submission, it will become an international standard much more quickly. Published by SNIA as a standard in March 2020, it was submitted to ISO in May and began a 90-day ballot in August. If all goes as expected, SMI-S v1.8 will be approved as ISO/IEC 24775:2020 by the end of the year – less than a year after its publication as a SNIA architecture.

Subscribe to the SNIA Matters Newsletter here to stay up-to-date on all SNIA announcements and be one of the first to learn the status of the SMI-S 1.8 rev 5 storage specification. Want to learn more about ISO and PAS? You might find the following links useful:

Ilker Cebeli

Oct 20, 2020

Everyone is looking to squeeze more efficiency from storage. That’s why the SNIA Networking Storage Forum hosted a live webcast last month, “Compression: Putting the Squeeze on Storage.” The audience asked many great questions on compression techniques. Here are answers from our expert presenters, John Kim and Brian Will:

Q. When multiple unrelated entities are likely to compress the data, how do they understand that the data is already compressed and so skip the compression?

A. Often they can tell from the file extension or header that the file has already been compressed. Otherwise each entity that wants to compress the data will try to compress it and then discard the results if it makes the file larger (because it was already compressed). 

Q. I’m curious about storage efficiency of data reduction techniques (compression/ thin provisioning etc) on certain database/server workloads which end up being more of a hindrance. Ex: Oracle ASM, which does not perform very well under any form of storage efficiency method. In such scenarios, what would be the recommendation to ensure storage is judiciously utilized?

A. Compression works well for some databases but not others, depending both on how much data repetition occurs within the database and how the database tables are structured. Database compression can be done on the row, column or page level, depending on the method and the database structure. Thin provisioning generally works best if multiple applications using the storage system (such as the database application) want to reserve or allocate more space than they actually need. If your database system does not like the use of external (storage-based, OS-based, or file system-based) space efficiency techniques, you should check if it supports its own internal compression options.

Q. What is a DPU?

A. A DPU is a data processing unit that specializes in moving, analyzing and processing data as it moves in and out of servers, storage, or other devices. DPUs usually combine network interface card (NIC) functionality with programmable CPU and/or FPGA cores. Some possible DPU functions include packet forwarding, encryption/decryption, data compression/decompression, storage virtualization/acceleration, executing SDN policies, running a firewall agent, etc. 

Q. What's the difference between compression and compaction?

A. Compression replaces repeated data with either shorter symbols or pointers that represent the original data but take up less space. Compaction eliminates empty space between blocks or inside of files, often by moving real data closer together. For example, if you store multiple 4KB chunks of data in a storage system that uses 32KB blocks, the default storage solution might consume one 32KB storage block for each 4KB of data. Compaction could put 5 to 8 of those 4KB data chunks into one 32KB storage block to recover wasted free space.
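
The arithmetic behind that example, as a quick sketch (block and chunk sizes taken from the answer above):

```python
BLOCK = 32 * 1024   # storage block size
CHUNK = 4 * 1024    # size of each data chunk
chunks = 8

naive_blocks = chunks                            # one block per chunk
compacted_blocks = -(-chunks * CHUNK // BLOCK)   # ceil division -> 1
reclaimed = (naive_blocks - compacted_blocks) * BLOCK
print(f"{naive_blocks} blocks -> {compacted_blocks} block, "
      f"reclaiming {reclaimed // 1024} KB")      # 8 blocks -> 1, 224 KB
```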

Q. Is data encryption at odds with data compression?  That is, is data encryption a problem for data compression?

A. If you encrypt data first, it usually makes compression of the encrypted data difficult or impossible, depending on the encryption algorithm. (A simple substitution cypher would still allow compression but wouldn't be very secure.) In most cases, the answer is to first compress the data then encrypt it. Going the other way, the reverse process is to first decrypt the data then decompress it.
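
A minimal sketch of that ordering in Python, using zlib from the standard library and Fernet from the third-party cryptography package (our choice of cipher for illustration; any cipher makes the same point):

```python
import zlib
from cryptography.fernet import Fernet

f = Fernet(Fernet.generate_key())
plaintext = b"AAAA" * 1000                 # highly compressible

# Write path: compress first, then encrypt.
stored = f.encrypt(zlib.compress(plaintext))
# Read path: reverse order -- decrypt, then decompress.
assert zlib.decompress(f.decrypt(stored)) == plaintext

# Encrypting first leaves the compressor almost nothing to squeeze:
wrong_order = zlib.compress(f.encrypt(plaintext))
print(len(stored), "vs", len(wrong_order))  # compress-first is far smaller
```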

Q. How do we choose the binary form code 00, 01, 101, 110, etc?

A. These will be used as the final symbol representations written into the output data stream. The table presented in the webcast is only illustrative; the algorithm documented in the DEFLATE RFC completely specifies how to represent symbols in a compacted binary form.

Q. Is there a resource for different algorithms vs CPU requirements vs compression ratios?

A. A good resource to see the cost versus ratio trade-offs of different algorithms is on GitHub here. This utility covers a wide range of compression algorithms, implementations and levels. The data shown at that GitHub location is benchmarked against the Silesia corpus, which represents a number of different data sets.

Q. Do these operations occur on individual data blocks, or is this across the entire compression job?

A. Assuming you mean the compression operations, it typically occurs across multiple data blocks in the compression window. The compression window almost always spans more than one data block but usually does not span the entire file or disk/SSD, unless it's a small file.

Q. How do we guarantee that important information is not lost during the lossy compression?

A. Lossy compression is not my current area of expertise, but there is a significant area of information theory called rate-distortion theory, used in the quantization of images for compression, that may be of interest. In addition, lossy compression is typically only used for files/data where it's known the users of that data can tolerate the data loss, such as images or video. The user or application can typically adjust the compression ratio to ensure an acceptable level of data loss.

Q. Do you see any advantage in performing the compression on the same CPU controller that is managing the flash (running the FTL, etc.)?

A. There may be cache benefits from running compression and flash on the same CPU, depending on the size of transactions. If the CPU is on the SSD controller itself, running compression there could offload the work from the main system CPU, allowing it to spend more cycles running applications instead of doing compression/decompression.

Q. Before compressing data, is there a method to check if the data is good to be compressed?

A. Some compression systems can run a quick scan of a file to estimate the likely compression ratio. Other systems look at the extension and/or header of the file and skip attempts to compress it if it looks like it's already compressed, such as most image and video files. Another solution is to actually attempt to compress the file and then discard the compressed version if it's larger than the original file.
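
A minimal sketch combining both approaches described above: a cheap sample-based estimate to skip data that looks pre-compressed, then the attempt-and-discard fallback. The 90% threshold and sample size are illustrative assumptions.

```python
import zlib

def maybe_compress(data: bytes, sample_size: int = 4096) -> bytes:
    # Quick scan: compress only a small sample to estimate the ratio.
    sample = data[:sample_size]
    if len(zlib.compress(sample, 1)) > 0.9 * len(sample):
        return data                       # looks pre-compressed; skip it
    compressed = zlib.compress(data)
    # Discard the result if compression made the data larger.
    return compressed if len(compressed) < len(data) else data

print(len(maybe_compress(b"text " * 10_000)))   # far smaller than 50,000
```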

Q. If we were to compress on a storage device (SSD), what do you think are the top challenges? Error propagation? Latency/QoS or other?

A. Compressing on a storage device could mean higher latency for the storage device, both when writing files (if compression is inline) or when reading files back (as they are decompressed). But it's likely this latency would otherwise exist somewhere else in the system if the files were being compressed and decompressed somewhere other than on the storage device. Compressing (and decompressing) on the storage device means the data will be transmitted to (and from) the storage while uncompressed, which could consume more bandwidth. If an SSD is doing post compression (i.e. compression after the file is stored and not inline as the file is being stored), it would likely cause more wear on the SSD because each file is written twice.

Q. Are all these CPU-based compression analyses?

A. Yes, these are CPU-based compression analyses.

Q. Can you please characterize the performance difference between, say LZ4 and Deflate in terms of microseconds or nanoseconds?

A. Extrapolating from the data available here, an 8KB request using LZ4 fast level 3 (lz4fast 1.9.2 -3) would take 9.78 usec for compression and 1.85 usec for decompression, while with zlib level 1 an 8KB request takes 68.8 usec for compression and 21.39 usec for decompression. Another aspect to note is that while LZ4 fast level 3 takes significantly less time, its compression ratio is 50.52% versus 36.45% for zlib level 1, showing that better compression ratios can have a significant cost.
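
To reproduce this kind of comparison on your own hardware, here is a minimal sketch using the standard library zlib and the third-party lz4 package (pip install lz4); absolute numbers and ratios will differ by machine and data set.

```python
import time
import zlib
import lz4.frame

data = (b"the quick brown fox jumps over the lazy dog " * 200)[:8192]

def bench(name, compress, runs=1000):
    start = time.perf_counter()
    for _ in range(runs):
        out = compress(data)
    usec = (time.perf_counter() - start) / runs * 1e6
    print(f"{name}: {usec:.2f} usec/op, ratio {len(out) / len(data):.2%}")

bench("lz4 frame", lz4.frame.compress)
bench("zlib level 1", lambda d: zlib.compress(d, 1))
```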

Q. How important is the compression ratio when you are using specialty products?

A. The compression ratio is a very important result for any compression algorithm or implementation.

Q. In slide #15, how do we choose the binary code form for the characters?

A. The binary code form in this example is entirely controlled by the frequency of occurrence of the symbol within the data stream: the higher the symbol frequency, the shorter the binary code assigned. The algorithm used here is just for illustrative purposes and would not be used (at least in this manner) in a standard. Huffman encoding in DEFLATE is a good example of a defined encoding algorithm.
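
For readers who want to see frequency-driven code assignment in action, here is a textbook Huffman construction in Python; it is our illustrative sketch of the general idea, not the canonical Huffman coding that DEFLATE itself specifies.

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Assign shorter binary codes to more frequent symbols."""
    freq = Counter(data)
    # Heap entries: (weight, tiebreak id, node); leaves are symbols.
    heap = [(w, i, sym) for i, (sym, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:                 # merge the two lightest nodes
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (w1 + w2, next_id, (left, right)))
        next_id += 1
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):      # internal node: recurse
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                            # leaf: record the symbol's code
            codes[node] = prefix or "0"
    walk(heap[0][2], "")
    return codes

# 'a' dominates the stream, so it receives the shortest code:
print(huffman_codes(b"aaaaaabbbc"))      # e.g. {99: '00', 98: '01', 97: '1'}
```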

This webcast was part of a SNIA NSF series on data reduction. Please check out the other two sessions:

Answering Your Questions on EDSFF

Jonmichael Hands

Oct 19, 2020

We had a tremendous response to our webcast asking if we were truly at the end of the 2.5-inch disk era. The SNIA Compute, Memory, and Storage Initiative SSD Special Interest Group brought together experts from Dell, Facebook, HPE, JEDEC, KIOXIA, Lenovo, and Microsoft in a lively follow-on to the Enterprise and Data Center SSD Form Factor (EDSFF) May 2020 discussions at OCP Summit. If you missed our live webcast, watch it on demand. Webcast attendees raised a variety of questions. Our experts provide answers to them here:

Q: SFF-TA-1006 suggests E1.S can support a max of 25W with a 25mm asymmetric heat sink. What are the air-flow assumptions for this estimate? Are there any thermal models and test guidelines available for EDSFF form factors?

A: Yes! The SNIA SFF TA TWG has a new project on Enterprise Device Form Factor Thermal Requirements. There was a great presentation about this at the OCP 2020 Virtual Summit. Registered attendees can log in to view.

Q: When is the transition from U.2 to E3 expected to occur?

A: We expect a few products for PCIe 4.0 but a wide industry transition at PCIe 5.0.

Q: No one is mentioning dual-path. Is this an enterprise requirement, and does the EDSFF connector support it?

A: EDSFF does support dual port and enterprise storage high-availability applications, with a dual port enable pin, if the SSD vendor supports it.

Q: What is the future vision for fault tolerance options (RAID, mirroring, etc.) for these devices in data center use cases, in a way that doesn't compromise or bottleneck their performance? HW RAID controllers could bottleneck on PCIe bandwidth as one speaker mentioned, and SW RAID or virtual SAN solution overhead plus application workloads could bottleneck on CPU before you're able to max out storage performance (especially at these extreme density possibilities, wow!). Thanks!

A: SNIA just did a webcast on NVMe RAID! See "RAID on CPU: RAID for NVMe SSDs without a RAID Controller Card" in our Educational Library.

Q: For open-frame drives, is a bezel required for EMI containment?

A: Yes, a latch is expected to have some EMI design.

Q: Do all EDSFF form factors support hot-plugging? Can EDSFF physical form factors support all capabilities defined in the NVMe (storage) specification?

A: Yes, hot plug is expected on EDSFF drives and is included in the SFF-TA-1009 pinout spec for support (interface and presence detect pins).

Q: Can a 2U system plug 2x E1.S into the same bay where 1x E3.S can be inserted?

A: This would require a custom PCB or carrier card. A single E1.S is electrically compatible with an E3 slot.

Q: If the '1' in E1.L and E1.S means '1 rack unit,' what does the '3' in the E3 family mean, since E3 is compatible with 1U and 2U?

A: Correct, E3 was originally for a "3-inch media device" intended to be optimal for 2U vertically oriented, or 1U horizontal.

Q: Can you estimate the shipment volume share for each form factor? I would especially like to know the 70W E3.L shipment unit forecast.

A: As an example, Intel has publicly stated that it expects up to 35% of total data center SSD revenue to be on EDSFF by 2024. Analysts like Forward Insights have detailed EDSFF and SSD form factor market analysis.

Q: No one is talking about shock and vibe for this connector. Is this an exercise for the carrier designer?

A: Mechanical specifications are expected to be similar to U.2 drives today.

Q: With the 2.5-inch challenges and the migration to other form factors, what about client systems? It would seem the x4 interfaces would work just fine on a desktop, and the slimmest E1.S on a laptop for better heat dissipation.

A: Many people think M.2 will not be adequate for PCIe 5.0 speeds and power, and companies are looking at reusing E1.S, or another client EDSFF variant with the same connector, as a successor for desktops and workstations.

Q: Is there a detailed standard for SSD configurations with different device types, and once there is an error, could SSDs be combined to solve the error problem?

A: SSD data protection is generally solved with RAID, software mirroring, or erasure code.

Q: When will volume servers support the new form factors/connectors?

A: We already see production servers for EDSFF from Supermicro, Wiwynn, and AIC, plus announcements from Lenovo and others launching soon!

Q: Is U.3 dead?

A: U.3 means conforming to the SFF-TA-1001 link definition specification for the SFF-8639 connector. It isn't a form factor definition like EDSFF, and it applies to U.2 (2.5-inch form factor) drives used with a tri-mode HBA and a properly enabled system backplane. Read more about U.3 with this article. Intel and other suppliers are transitioning from U.2 and M.2 directly to EDSFF, bypassing support for U.3. Some companies are supporting U.3 technology through servers, HBAs, and SSDs.


An FAQ on RAID on the CPU

Paul Talbut

Oct 15, 2020


A few weeks ago, SNIA EMEA hosted a webcast to introduce the concept of RAID on CPU. The invited experts, Fausto Vaninetti from Cisco, and Igor Konopko from Intel, provided fascinating insights into this exciting new technology.

The webcast created a huge amount of interest and generated a host of follow-up questions which our experts have addressed below. If you missed the live event “RAID on CPU: RAID for NVMe SSDs without a RAID Controller Card” you can watch it on-demand.

Q. Why not RAID 6?

A. RAID on CPU is a new technology, and current support covers the most commonly used RAID levels, considering this is for servers, not disk arrays. RAID 5 is the primary parity RAID level for NVMe, protecting against a single drive failure, which is reasonable given NVMe SSDs' lower annualized failure rates (AFRs) and faster rebuilds.

Q. Is the XOR for RAID 5 done in Software?

A. Yes, it is done in software on some cores of the Xeon CPU.

Q. Which generation of Intel CPUs support VROC?

A. All Intel Xeon Scalable Processors, starting with Generation 1 and continuing through with new CPU launches support VROC.

Q. How much CPU performance is used by the VROC implementation?

A. It depends on the OS, workload, and RAID level. In Linux, Intel VROC is a kernel storage stack and is not directly tied to specific cores, allowing it to scale based on the I/O demand to the storage subsystem. This allows performance to scale as the number of attached NVMe drives increases. Under lighter workloads, Intel VROC and HBAs have similar CPU consumption. Under heavier workloads, CPU consumption increases for Intel VROC, but so does performance (IOPS/bandwidth), while the HBA hits a bottleneck (i.e., limited scaling). In Windows, CPU consumption can be higher, and performance does not scale as well due to differences in the storage stack implementation.

Q. Why do we only see VROC on Supermicro and Intel servers? Do the others not have the technology, or have they preferred not to implement it?

A. This is not correct; more vendors than Supermicro and Intel support VROC. For example, Cisco is fully behind this technology and has a key-less implementation across its UCS B and C portfolio. New designs with VROC are typically tied to new CPU/platform launches from Intel, so keep an eye on your preferred platform providers as new platforms are launched.

Q. Are there plans from VROC to have an NVMe Target Implementation to connect external hosts?

A. Yes, VROC can be included in an NVMe-oF target. While not the primary use case for VROC, it will work. We are exploring this with customers to understand gaps and additional features to make VROC a better fit.

Q. Are there limitations for dual CPU configurations or must the VROC be configured for single CPU?

A. VROC can be enabled on dual CPU servers as well as single CPU servers. The consideration to keep in mind is that a RAID volume spanning multiple CPUs could see reduced performance, so it is not recommended if it can be avoided.

Q. I suggest having a key or explaining what x16 PCIe means in diagrams. It does mean the memory, right?

A. No, it does not refer to memory. PCIe x16 denotes a PCIe link that is 16 lanes wide; bandwidth scales with the lane count.
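
As a rough, hedged illustration of what lane count means for throughput, here is a minimal back-of-the-envelope sketch; the 8 GT/s rate and 128b/130b encoding are PCIe 3.0 figures, and later generations roughly double the per-lane rate:

```python
# Back-of-the-envelope PCIe bandwidth math (PCIe 3.0 assumptions:
# 8 GT/s per lane, 128b/130b encoding).
transfers_per_s = 8e9            # transfers per second, per lane
efficiency = 128 / 130           # usable payload fraction after encoding
bytes_per_lane = transfers_per_s * efficiency / 8
print(f"x16 link: ~{16 * bytes_per_lane / 1e9:.1f} GB/s")  # -> ~15.8 GB/s
```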

Q. Do you have maximum performance results (IOPS, random read) of VROC on 24 NVMe devices?

A. This webinar presented some performance results. If more is needed, please contact your server vendor. Some additional performance results can be found at www.intel.com/vroc in the Support and Documentation Section at the bottom.

Q. I see "tps" and "ops" and "IOPs" within the presentation.  Are they all the same?  Transactions Per Second = Operations Per Second = I/O operations per second?

A. No, they are not the same. Transactions per second is an application-level measure, while I/O operations per second (IOPS) is a storage-level measure; a single application transaction may generate multiple I/O operations.

Q. I see the performance of Random Read is not scaling 4 times (2.5M) of pass-thru in case of windows (952K), whereas in Linux it is scaling (2.5M). What could be the reason for such low performance?

A. Due to differences in the way operating systems work, Linux is offering the best performance so far.

Q. Is there an example of using VROC for VMware (ESXi)?

A. VROC RAID is not supported for ESXi, but VMD is supported for robust attachment of NVMe SSDs with hot-plug and LED management.

Q. How do you protect RAID-1 data integrity if a power loss happened after only one drive is updated?

A. With a RAID 1 schema, your data can still be read even if only a single drive was written before the power loss, since each drive holds a complete copy. With RAID 5, you need multiple drives available to rebuild your data.

Q. Where can I learn more about the VROC IC option?

A. Contact your server vendor or Intel representatives.

Q. In the last two slides, the MySQL and MongoDB configs, is the OS / boot protected? Is Boot on the SSDs or other drive(s)?

A. In this case, boot was on a separate device and was not protected, but only because this was a test server. Intel VROC does support bootable RAID, so RAID 1 redundant boot can be applied to the OS. This means that on one platform, Intel VROC can support RAID 1 for boot alongside separate data RAID sets.

Q. Does this VROC need separate OS Drivers or do they have inbox support (for both Linux and Windows)?

A. There is inbox support in Linux, and to get the latest features the recommendation remains to use the latest available OS releases. In some cases, a Linux OS driver is provided to backport support to older OS releases. In Windows, everything is delivered through an OS driver package.

Q. 1. Is Intel VMD a 'hardware' feature to newer XEON chips?  2. If VMD is software, can it be installed into existing servers?  3. If VMD is on a server today, can VROC be added to an existing server?

A. VMD is a prerequisite for VROC and is a hardware feature of the CPU along with relevant UEFI and OS drivers. VMD is possible on Intel Xeon Scalable Processors, but it also needs to be enabled by the server's motherboard and its firmware. It’s best to talk to your server vendor.

Q. In traditional spinning rust RAID, drive failure is essentially random (chance increases based on power on hours); with SSDs, failure is not mechanical and is ultimately based on lifetime utilization/NAND cells wearing out. How does VROC or RAID on CPU in general handle wear leveling to ensure that a given disk group doesn't experience multiple SSD failures at roughly the same time?

A. In general, server vendors have a way to show the wear level for supported SSDs, and that can help in this respect.

Q. Any reasons for not using caching on Optane memory instead of Optane SSD?

A. Using Intel Optane Persistent Memory Modules for caching is a use case that may be addressed over time. The current caching implementation requires a block device, so using an Intel Optane SSD was the more direct use case.

Q. Wouldn't the need to add 2x Optane drives negate the economic benefit of VROC vs hardware RAID?

A. It depends on the use case. Clearly there is a cost associated with adding Optane to the mix. In some cases, only two 100GB Intel Optane SSDs are needed, which is still economical.

Q. Does VROC require Platinum processor?  Does Gold/Silver processors support VROC?

A. Intel VROC and VMD are supported across the Intel Xeon Scalable Processor product SKUs (Bronze through Platinum) as well as other product families such as Intel Xeon-D and Intel Xeon-W.

Q. Which NVMe spec is VROC complying to?

A. NVMe 1.4

Q. WHC is disabled by default. When should it be enabled? After a write fault has happened, or before I/O operations begin?

A. WHC should be enabled before you start writing data to your volumes. It can be required for critical data where data corruption cannot be tolerated under any circumstance.

Q. Which vendors offer Intel VROC with their systems?

A. Multiple vendors as of today, but the specifics of implementation, licensing and integrated management options may differ.

Q. Is VROC available today?

A. Yes, it launched in 2017.

Q. Is there a difference in performance between the processor categories? Platinum, gold and Silver have the same benefits?

A. Different processor categories have performance differences of their own, but VROC behaves the same across those CPUs.

Q. In a dual CPU config and there is an issue with the VMD on one processor, is there any protection?

A. This depends on how the devices are connected. SSDs can be connected to different VMDs and placed in RAID 1 arrays to offer protection. However, VMD is a hardware feature of the PCIe lanes, and VMD failure is not a common scenario.

Q. How many PCIe lanes on the CPU can be used for NVMe drives, and do Intel CPUs have enough PCIe lanes?

A. All CPU lanes on Intel Xeon Scalable Processors are VMD capable, but the actual lanes available for direct NVMe SSD connection depend on the server's motherboard design, so it is not the same for all vendors. In general, expect that about 50% of the PCIe lanes on a CPU can be used to connect NVMe SSDs.

Q. What is the advantage of Intel VMD/VROC over Storage Spaces (which is built in SWRAID solution in Windows)?

A. VROC supports both Linux and Windows and has a pre-OS component to offer bootable RAID.

Q. If I understand correctly, Intel VROC is hybrid raid, does it require any OS utility like mdadm to manage array on linux host?

A. VROC configuration can be achieved in many ways, including an Intel GUI or CLI tool. In Linux, the mdadm OS utility is used to manage the RAID arrays, as sketched below.
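
For readers who want to see that in practice, here is a minimal sketch, assuming a Linux host with root privileges, mdadm installed, VMD/VROC enabled in firmware, and four NVMe devices; all device and volume names are illustrative. VROC arrays under Linux use mdadm's IMSM (Intel Matrix Storage Manager) metadata format: a container is created first, then volumes are created inside it.

```python
# Minimal sketch: creating a VROC (IMSM-metadata) RAID 5 volume with mdadm.
# Device names are illustrative; run as root and adapt before use.
import subprocess

DRIVES = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]

def run(cmd):
    print("+", " ".join(cmd))          # echo each command for visibility
    subprocess.run(cmd, check=True)    # raise if mdadm reports an error

# 1. Create an IMSM container spanning the member drives.
run(["mdadm", "--create", "/dev/md/imsm0", "--metadata=imsm",
     "--raid-devices", str(len(DRIVES))] + DRIVES)

# 2. Create a RAID 5 volume inside the container.
run(["mdadm", "--create", "/dev/md/vol0", "--level=5",
     "--raid-devices", str(len(DRIVES)), "/dev/md/imsm0"])

# 3. Inspect the array and the initial sync progress.
run(["mdadm", "--detail", "/dev/md/vol0"])
run(["cat", "/proc/mdstat"])
```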

Q. Will you go over matrix raid? Curious about that one.

A. Matrix RAID allows multiple RAID levels to be configured on a common set of disks, if space is available. Example: a four-disk RAID 10 volume of 1TB, plus a RAID 5 volume using the remaining space on the same four disks (see the sketch below).
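
Building on the container sketch above (sizes and volume names are assumptions), mdadm's --size option caps how much of each drive the first volume consumes, leaving the remainder of the same IMSM container for a second volume at a different RAID level:

```python
# Sketch of a Matrix RAID layout: two volumes at different RAID levels
# carved out of one four-drive IMSM container (/dev/md/imsm0 from above).
import subprocess

def run(cmd):
    subprocess.run(cmd, check=True)

# Volume 1: RAID 10 using ~500 GiB from each drive (~1 TiB usable on 4 drives).
run(["mdadm", "--create", "/dev/md/fast", "--level=10",
     "--raid-devices=4", "--size=500G", "/dev/md/imsm0"])

# Volume 2: RAID 5 using the remaining space on the same four drives.
run(["mdadm", "--create", "/dev/md/bulk", "--level=5",
     "--raid-devices=4", "/dev/md/imsm0"])
```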

Q. I see a point saying VROC had better performance… Are there any VROC performance metrics (say 4K random read/write IOPS and 1M sequential reads/writes) available with Intel NVMe drives? Any comparison with SW RAID or HBA RAID solutions?

A. For comparisons, it is best to refer to specific server vendors, since they are not all the same. Some generic performance comparisons can be found at www.intel.com/vroc in the Support and Documentation section at the bottom.

Q. Which Linux Kernel & Windows version supports VROC?

A. VROC has an interoperability matrix posted on the web at this link: https://www.intel.com/content/dam/support/us/en/documents/memory-and-storage/ssd-software/Intel_VROC_Supported_Configs_6-3.pdf

Q. Does VROC support JBOD to be used with software defined storage? Can we create a RAID1 for boot and a jbod for vsan or Microsoft for example?

A. Yes, these use cases are possible.

Q. Which ESXi supports VMD (6.5 or 7 or both)? Any forecast for supporting VROC in future releases?

A. ESXi supports VMD starting at version 6.5U2 and continues forward with 6.7 and 7.0 releases.

Q. Can VMD support VROC with more than 4 drives?

A. Yes. VROC can support up to 48 NVMe SSDs per platform.

Q. What is the maximum no. of drives supported by a single VMD domain?

A. Today a VMD domain has 16 PCIe lanes, so four x4 NVMe SSDs can be direct-attached per domain. If PCIe switches are used, up to 24 NVMe SSDs can be attached to one VMD domain.

Q. Does VROC use any Caching mechanisms either through the Firmware or the OS Driver?

A. There is no caching in VROC today; it is being considered as a future option.

Q. How does VROC close RAID5 write hole?

A. Intel VROC uses a journaling mechanism to track in-flight writes and log them using the Power Loss Imminent feature of the RAID member SSDs. In a double-fault scenario that could otherwise cause RAID Write Hole corruption, VROC uses these journal logs to prevent any data corruption and to rebuild after reboot. The sketch below illustrates the underlying failure mode.
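
To make the failure mode concrete, here is a toy illustration of the arithmetic only, not of VROC's actual implementation: RAID 5 parity is the bytewise XOR of the data strips, so an update interrupted after the data write but before the parity write leaves the stripe inconsistent, and a later rebuild from the stale parity silently produces wrong data.

```python
# Toy illustration of the RAID 5 write hole (not VROC's implementation).
from functools import reduce

def parity(strips):
    # RAID 5 parity is the bytewise XOR of all strips in a stripe.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*strips))

stripe = [b"\x11", b"\x22", b"\x33"]   # one-byte strips on three data drives
p = parity(stripe)                     # parity strip, currently consistent

stripe[0] = b"\x99"                    # data strip rewritten...
# ...power fails here, before the parity strip is updated (the write hole).

# If the drive holding strip 1 now fails, rebuilding it from the stale
# parity reconstructs the wrong byte: silent data corruption.
rebuilt = parity([stripe[0], stripe[2], p])
assert rebuilt != b"\x22"              # stale parity yields 0xaa, not 0x22
```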

Q. So the RAID is part of VMD? Or is VMD only used for LED and hot-plug?

A. VMD is a prerequisite to VROC, so it is a key element. In simple terms, VROC provides the RAID capability; all the rest is VMD.

Q. What does LED mean?

A. Light Emitting Diode. In this context it refers to the drive status and locate lights whose management VMD supports.

Q. What is the maximum no. of NVMe SSDs that are supported by Intel VROC at a time?

A. That number is 48, but you need to ask your server vendor, since the motherboard must be capable of supporting that many.

Q. Definition of VMD domain?

A. A VMD domain can be described as a CPU-integrated End-Point to manage PCIe/NVMe SSDs. VMD stands for Volume Management Device.

Q. Does VROC also support esx as bootable device?

A. No, ESXi is not supported by VROC, but VMD is. In future releases, ESXi VMD functionality may add some RAID capabilities.

Q. Which are the Intel CPU Models that supports VMD & VROC?

A. All Intel Xeon Scalable Processors

Q. Is Intel VMD present on all Intel CPUs by default?

A. Intel Xeon Scalable Processors are required. But you also need to have the support on the server's motherboard.

Q. How is Software RAID (which uses system CPU) different than CPU RAID used for NVMe?

A. By software RAID, we mean a RAID mechanism that kicks in after the operating system has booted. Some vendors use the term SW RAID in a different way. CPU RAID for NVMe is a function of the CPU, rather than the OS, and also includes pre-OS/BIOS/platform components.

Q. I have been interested in VMD/VROC since it was introduced to me by Intel in 2017 with Intel Scalable Xeon (Purley) and the vendor I worked with then, Huawei, and now Dell Technologies has never adopted it into an offering. Why? What are the implementation impediments, the cost/benefit, and vendor resistance impacting wider adoption?

A. Different server vendors decide what technologies they are willing to support and with which priority. Today multiple server vendors are supporting VROC, but not all of them.

Q. What's the UBER (unrecoverable bit error rate) for NVMe drives? Same as SATA (10^-14), or SAS (10^-16), or other? (since we were comparing them - and it will be important for RAID implementations)

A. UBER is not influenced by VROC at all. In general, the UBER for SATA SSDs is very similar to that of NVMe SSDs.

Q. Can we get some more information or examples of Hybrid RAID. How is it exactly different from SWRAID?

A. In our description, SW RAID requires the OS to be operational before RAID can work. With hybrid RAID, this is not the case. Also, hybrid RAID has a hardware component that acts similarly to an HBA, in this case VMD. SW RAID does not have this isolation.

