An FAQ on RAID on the CPU

Paul Talbut

Oct 15, 2020


A few weeks ago, SNIA EMEA hosted a webcast to introduce the concept of RAID on CPU. The invited experts, Fausto Vaninetti from Cisco, and Igor Konopko from Intel, provided fascinating insights into this exciting new technology.

The webcast created a huge amount of interest and generated a host of follow-up questions which our experts have addressed below. If you missed the live event “RAID on CPU: RAID for NVMe SSDs without a RAID Controller Card” you can watch it on-demand.

Q. Why not RAID 6?

A. RAID on CPU is a new technology, and current support covers the most commonly used RAID levels for servers (as opposed to disk arrays). RAID 5 is the primary parity RAID level for NVMe, protecting against one drive failure, because NVMe SSDs have lower annualized failure rates (AFRs) and faster rebuilds.

Q. Is the XOR for RAID 5 done in Software?

A. Yes, it is done in software on some cores of the Xeon CPU.
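
For readers who want to see what that XOR math looks like, here is a minimal, illustrative Python sketch (not Intel's code): RAID 5 parity is the byte-wise XOR of a stripe's data chunks, so any single missing chunk can be rebuilt from the survivors.

```python
# Minimal illustration of RAID 5 parity math: parity is the byte-wise XOR of
# the data chunks in a stripe, so any one missing chunk can be reconstructed
# by XOR-ing the surviving chunks with the parity.

from functools import reduce

def xor_parity(chunks):
    """Byte-wise XOR across equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))

data = [b"AAAA", b"BBBB", b"CCCC"]        # data chunks of one stripe
parity = xor_parity(data)                 # what the parity member stores

# Simulate losing chunk 1 and rebuilding it from the survivors plus parity.
rebuilt = xor_parity([data[0], data[2], parity])
assert rebuilt == data[1]
```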

Q. Which generation of Intel CPUs support VROC?

A. All Intel Xeon Scalable Processors support VROC, starting with the first generation and continuing with new CPU launches.

Q. How much CPU performance is used by the VROC implementation?

A. It depends on OS, Workload, and RAID Level. In Linux, Intel VROC is a kernel storage stack and not directly tied to specific cores, allowing it to scale based on the IO demand to the storage subsystem. This allows performance to scale as the number of NVMe drives attached increases. Under lighter workloads, Intel VROC and HBAs have similar CPU consumption. Under heavier workloads, the CPU consumption increases for Intel VROC, but so does the performance (IOPS/bandwidth), while the HBA hits a bottleneck (i.e. limited scaling). In Windows, CPU consumption can be higher, and performance does not scale as well due to differences in the storage stack implementation.

Q. Why do we only see VROC on Supermicro & Intel servers? Do the others not have the technology, or have they preferred not to implement it?

A. This is not correct. There are more vendors supporting VROC than Supermicro and Intel itself. For example, Cisco is fully behind this technology and has a key-less implementation across its UCS B and C portfolio. New designs with VROC are typically tied to new CPU/platform launches from Intel, so keep an eye on your preferred platform providers as new platforms are launched.

Q. Are there plans for VROC to have an NVMe target implementation to connect external hosts?

A. Yes, VROC can be included in an NVMe-oF target. While not the primary use case for VROC, it will work. We are exploring this with customers to understand gaps and additional features to make VROC a better fit.

Q. Are there limitations for dual CPU configurations or must the VROC be configured for single CPU?

A. VROC can be enabled on dual CPU servers as well as single CPU servers. The consideration to keep in mind is that a RAID volume spanning multiple CPUs could see reduced performance, so it is not recommended if it can be avoided.

Q. I suggest having a key or explaining what x16 PCIe means in the diagrams. It does mean the memory, right?

A. No, it does not refer to memory. PCIe x16 indicates a PCIe bus (link) implemented with 16 lanes.
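
As a rough illustration of what those 16 lanes provide, the back-of-the-envelope calculation below assumes 8 GT/s per lane for PCIe Gen3 (16 GT/s for Gen4) and 128b/130b encoding, ignoring protocol overhead.

```python
# Approximate usable throughput of a PCIe link. "x16" simply means 16 lanes
# bonded into one link; assumed figures: Gen3 = 8 GT/s per lane, Gen4 = 16 GT/s,
# 128b/130b line encoding, no protocol overhead.

def pcie_gbps(lanes, gt_per_s):
    encoding = 128 / 130                     # 128b/130b line encoding
    return lanes * gt_per_s * encoding / 8   # GT/s -> GB/s

print(f"x4  Gen3: {pcie_gbps(4, 8):.1f} GB/s")   # typical single NVMe SSD link
print(f"x16 Gen3: {pcie_gbps(16, 8):.1f} GB/s")
print(f"x16 Gen4: {pcie_gbps(16, 16):.1f} GB/s")
```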

Q. Do you have maximum performance results (IOPS, random read) of VROC on 24 NVMe devices?

A. This webinar presented some performance results. If more is needed, please contact your server vendor. Some additional performance results can be found at www.intel.com/vroc in the Support and Documentation Section at the bottom.

Q. I see "tps" and "ops" and "IOPs" within the presentation.  Are they all the same?  Transactions Per Second = Operations Per Second = I/O operations per second?

A. No, they are not the same. I/O operations are closer to storage concepts, transactions per second are closer to application concepts.

Q. I see the performance of Random Read is not scaling to 4x that of pass-through (2.5M) in the case of Windows (952K), whereas in Linux it is scaling (2.5M). What could be the reason for such low performance?

A. Due to differences in the way operating systems work, Linux is offering the best performance so far.

Q. Is there an example of using VROC for VMware (ESXi)?

A. VROC RAID is not supported for ESXi, but VMD is supported for robust attachment of NVMe SSDs with hot-plug and LED management.

Q. How do you protect RAID-1 data integrity if a power loss happened after only one drive is updated?

A. With the RAID 1 scheme, your data can still be read once a single drive has been written. With RAID 5, multiple drives must be available to rebuild your data.

Q. Where can I learn more about the VROC IC option?

A. Contact your server vendor or Intel representatives.

Q. In the last two slides, the MySQL and MongoDB configs, is the OS / boot protected? Is Boot on the SSDs or other drive(s)?

A. In this case boot was on a separate device and was not protected, but only because this was a test server. Intel VROC does support bootable RAID, so RAID 1 redundant boot can be applied to the OS. This means that on one platform, Intel VROC can support RAID 1 for boot alongside separate data RAID sets.

Q. Does this VROC need separate OS Drivers or do they have inbox support (for both Linux and Windows)?

A. There is inbox support in Linux, and to get the latest features the recommendation remains to use the latest available OS releases. In some cases, a Linux OS driver is provided to backport support to older OS releases. In Windows, everything is delivered through an OS driver package.

Q. 1. Is Intel VMD a 'hardware' feature to newer XEON chips?  2. If VMD is software, can it be installed into existing servers?  3. If VMD is on a server today, can VROC be added to an existing server?

A. VMD is a prerequisite for VROC and is a hardware feature of the CPU along with relevant UEFI and OS drivers. VMD is possible on Intel Xeon Scalable Processors, but it also needs to be enabled by the server's motherboard and its firmware. It’s best to talk to your server vendor.

Q. In traditional spinning rust RAID, drive failure is essentially random (chance increases based on power on hours); with SSDs, failure is not mechanical and is ultimately based on lifetime utilization/NAND cells wearing out. How does VROC or RAID on CPU in general handle wear leveling to ensure that a given disk group doesn't experience multiple SSD failures at roughly the same time?

A. In general, server vendors have a way to show the wear level of supported SSDs, which can help in this respect.

Q. Any reasons for not using caching on Optane memory instead of Optane SSD?

A. Using the Optane Persistent Memory Module is a use case that may be expanded over time. The current caching implementation requires a block device, so using an Intel SSD was the more direct use case.

Q. Wouldn't the need to add 2x Optane drives negate the economic benefit of VROC vs hardware RAID?

A. It depends on use cases. Clearly there is a cost associated to adding Optane in the mix. In some cases, only 2x 100GB Intel Optane SSDs are needed, which is still economical.

Q. Does VROC require Platinum processor?  Does Gold/Silver processors support VROC?

A. Intel VROC & VMD are supported across the Intel Xeon Scalable Processor product SKUs (Bronze through Platinum) as well as other product families such as Intel Xeon-D and Intel Xeon-W.

Q. Which NVMe spec is VROC complying to?

A. NVMe 1.4

Q. WHC is disabled by default. When should it be enabled? After write fault happened? or enabled before IO operation?


A. WHC should be enabled before you start writing data to your volumes. It can be required for critical data where data corruption cannot be tolerated under any circumstance.

Q. Which vendors offer Intel VROC with their systems?

A. Multiple vendors as of today, but the specifics of implementation, licensing and integrated management options may differ.

Q. Is VROC available today?

A. Yes, it launched in 2017.

Q. Is there a difference in performance between the processor categories? Platinum, gold and Silver have the same benefits?

A. Different processor categories have performance differences of their own, but VROC itself does not differ across those CPUs.

Q. In a dual CPU config and there is an issue with the VMD on one processor, is there any protection?

A. This depends on how the devices are connected. SSDs could be connected to different VMDs and in RAID1 arrays to offer protection. However, VMD is a HW feature of the PCIe lanes and is not a common failure scenario.

Q. How many PCIe lanes on the CPU can be used for NVMe drives, and do Intel CPUs have enough PCIe lanes?

A. All CPU lanes on Intel Xeon Scalable Processors are VMD capable, but the actual lanes available for direct NVMe SSD connection depend on the server's motherboard design, so it is not the same for all vendors. In general, consider that about 50% of the PCIe lanes on a CPU can be used to connect NVMe SSDs.
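
As an illustrative lane budget only, the small calculation below assumes 48 PCIe lanes per socket (as on 1st/2nd Gen Xeon Scalable) and x4 per NVMe SSD; the real numbers depend on the motherboard design, as the answer above notes.

```python
# Illustrative lane budget (assumptions: 48 PCIe lanes per socket, x4 per SSD,
# roughly half the lanes routed to drive bays). Real designs vary by vendor.

LANES_PER_SOCKET = 48
LANES_PER_NVME = 4

def max_direct_nvme(sockets, fraction_for_storage=0.5):
    """How many direct-attached NVMe SSDs fit in the assumed lane budget."""
    usable = sockets * LANES_PER_SOCKET * fraction_for_storage
    return int(usable // LANES_PER_NVME)

print(max_direct_nvme(1))   # ~6 direct-attached drives on a 1-socket board
print(max_direct_nvme(2))   # ~12 on a 2-socket board (switches raise this further)
```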

Q. What is the advantage of Intel VMD/VROC over Storage Spaces (which is built in SWRAID solution in Windows)?

A. VROC supports both Linux and Windows and has a pre-OS component to offer bootable RAID.

Q. If I understand correctly, Intel VROC is hybrid raid, does it require any OS utility like mdadm to manage array on linux host?

A. VROC configuration is achieved in many ways, including the Intel GUI or CLI tools. In Linux, the mdadm utility is used to manage the RAID arrays.
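
For readers unfamiliar with mdadm, the hedged sketch below shows the typical shape of a VROC (IMSM) array creation on Linux: a container first, then a volume inside it. Device names are placeholders; run as root and follow the Intel VROC Linux guide for the exact procedure on your platform.

```python
# Hedged sketch of VROC (IMSM) array creation with mdadm on Linux. Device
# names are placeholders and this is not a substitute for Intel's documentation.

import subprocess

MEMBERS = ["/dev/nvme0n1", "/dev/nvme1n1", "/dev/nvme2n1", "/dev/nvme3n1"]

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# 1. IMSM container holding the member drives.
run(["mdadm", "--create", "/dev/md/imsm0", "--metadata=imsm",
     f"--raid-devices={len(MEMBERS)}", *MEMBERS])

# 2. A RAID 5 volume carved out of that container.
run(["mdadm", "--create", "/dev/md/vroc_r5", "--level=5",
     f"--raid-devices={len(MEMBERS)}", "/dev/md/imsm0"])

# 3. Inspect the volume and the platform's VROC/IMSM capabilities.
run(["mdadm", "--detail", "/dev/md/vroc_r5"])
run(["mdadm", "--detail-platform"])
```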

Q. Will you go over matrix raid? Curious about that one.

A. Matrix RAID means multiple RAID levels configured on a common set of disks, if space is available. Example: a four-disk RAID 10 volume of 1TB plus a RAID 5 volume using the remaining space on the same four disks.
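
Continuing the hedged mdadm sketch above, matrix RAID is simply two volumes created inside the same IMSM container; the names, levels, and sizes below are illustrative placeholders only.

```python
# Two volumes in one IMSM container, assuming a freshly created container such
# as /dev/md/imsm0 from the sketch above. The first volume is size-limited
# (--size is per member device); the second takes the remaining space.

import subprocess

def run(cmd):
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)

# A RAID 10 volume using ~250 GiB of each member drive...
run(["mdadm", "--create", "/dev/md/fast_r10", "--level=10",
     "--raid-devices=4", "--size=250G", "/dev/md/imsm0"])

# ...and a RAID 5 volume over the space left on the same four drives.
run(["mdadm", "--create", "/dev/md/bulk_r5", "--level=5",
     "--raid-devices=4", "/dev/md/imsm0"])
```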

Q. I see a point saying VROC had better performance… Are there any VROC performance metrics (say 4K RR/RW IOPS and 1M sequential reads/writes) available with Intel NVMe drives? Any comparison with any SWRAID or HBA RAID solutions?

A. For comparisons, it is best to refer to specific server vendors since they are not all the same. Some generic performance comparisons can be found at www.intel.com/vroc in the Support and Documentation section at the bottom.

Q. Which Linux Kernel & Windows version supports VROC?

A. VROC has an interoperability matrix posted on the web at this link: https://www.intel.com/content/dam/support/us/en/documents/memory-and-storage/ssd-software/Intel_VROC_Supported_Configs_6-3.pdf

Q. Does VROC support JBOD to be used with software-defined storage? Can we create a RAID 1 for boot and a JBOD for vSAN or Microsoft, for example?

A. Yes, these use cases are possible.

Q. Which ESXi supports VMD (6.5 or 7 or both)? Any forecast for supporting VROC in future releases?

A. ESXi supports VMD starting at version 6.5U2 and continues forward with 6.7 and 7.0 releases.

Q. Can VMD support VROC with more than 4 drives?

A. VROC can support up to 48 NVMe SSDs per platform.

Q. What is the maximum no. of drives supported by a single VMD domain?

A. Today a VMD domain has 16 PCIe lanes, so four direct-attached NVMe SSDs are supported per domain. If switches are used, up to 24 NVMe SSDs can be attached to one VMD domain.

Q. Does VROC use any Caching mechanisms either through the Firmware or the OS Driver?

A. There is no caching in VROC today; it is being considered as a future option.

Q. How does VROC close RAID5 write hole?

A. Intel VROC uses a journaling mechanism to track in-flight writes and log them using the Power Loss Imminent feature of the RAID member SSDs. In a double-fault scenario that could otherwise cause RAID write hole corruption, VROC uses these journal logs to prevent data corruption and to rebuild after reboot.
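
The toy sketch below illustrates the journaling idea only (it is not Intel's implementation): the new data and parity are logged before the member drives are touched, so an interrupted stripe update can be replayed after a crash instead of leaving data and parity inconsistent.

```python
# Conceptual journal-then-commit sketch for a RAID 5 stripe update.
# "journal" stands in for the persistent log kept on the member SSDs.

journal = {}
drives = {0: b"AAAA", 1: b"BBBB", 2: b"\x03\x03\x03\x03"}   # data0, data1, parity

def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

def write_chunk(stripe, idx, new_data):
    new_parity = xor(xor(drives[2], drives[idx]), new_data)   # parity update
    journal[stripe] = (idx, new_data, new_parity)             # 1. log first
    drives[idx] = new_data                                    # 2. write data
    # ...a power loss here is the classic "write hole" window...
    drives[2] = new_parity                                    # 3. write parity
    del journal[stripe]                                       # 4. retire log entry

def replay_after_crash():
    for stripe, (idx, new_data, new_parity) in journal.items():
        drives[idx], drives[2] = new_data, new_parity         # make stripe consistent

write_chunk(0, 1, b"XXXX")
```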

Q. So is the RAID part of VMD, or is VMD only used for LED and hot-plug?

A. VMD is a prerequisite for VROC, so it is a key element. In simple terms, VROC is the RAID capability; all the rest is VMD.

Q. What does LED mean?

A. Light Emitting Diode

Q. What is the maximum no. of NVMe SSDs that are supported by Intel VROC at a time?

A. That number would be 48 but you need to ask your server vendor since the motherboard needs to be capable of that.

Q. Definition of VMD domain?

A. A VMD domain can be described as a CPU-integrated End-Point to manage PCIe/NVMe SSDs. VMD stands for Volume Management Device.

Q. Does VROC also support esx as bootable device?

A. No, ESXi is not supported by VROC, but VMD is. In future releases, ESXi VMD functionality may add some RAID capabilities.

Q. Which are the Intel CPU Models that supports VMD & VROC?

A. All Intel Xeon Scalable Processors

Q. Is Intel VMD present on all Intel CPUs by default?

A. Intel Xeon Scalable Processors are required. But you also need to have the support on the server's motherboard.

Q. How is Software RAID (which uses system CPU) different than CPU RAID used for NVMe?

A. By software RAID we mean a RAID mechanism that kicks in after the operating system has booted. Some vendors use the term SW RAID in a different way. CPU RAID for NVMe is a function of the CPU, rather than the OS, and also includes pre-OS/BIOS/platform components.

Q. I have been interested in VMD/VROC since it was introduced to me by Intel in 2017 with Intel Scalable Xeon (Purley) and the vendor I worked with then, Huawei, and now Dell Technologies has never adopted it into an offering. Why? What are the implementation impediments, the cost/benefit, and vendor resistance impacting wider adoption?

A. Different server vendors decide what technologies they are willing to support and with which priority. Today multiple server vendors are supporting VROC, but not all of them.

Q. What's the UBER (unrecoverable bit error rate) for NVMe drives? Same as SATA (10^-14), or SAS (10^-16), or other? (since we were comparing them - and it will be important for RAID implementations)

A. UBER is not influenced by VROC at all. In general, the UBER for SATA SSDs is very similar to that of NVMe SSDs.

Q. Can we get some more information or examples of Hybrid RAID. How is it exactly different from SWRAID?

A. In our description, SW RAID requires the OS to be operational before RAID can work. With hybrid RAID, this is not the case. Also, hybrid RAID has a HW component that acts similarly to an HBA; in this case that is VMD. SW RAID does not have this isolation.


See You (Online) at SDC!

Marty Foltyn

Sep 15, 2020

title of post
We’re going virtual in 2020, and Compute, Memory, and Storage are important topics at the upcoming SNIA Storage Developer Conference. SNIA CMSI is a sponsor of SDC 2020 – so visit our booth for the latest information and a chance to chat with our experts. With over 120 sessions available to watch live during the event and later on-demand, live Birds of a Feather chats, and a Persistent Memory Bootcamp accessing new PM systems in the cloud, we want to make sure you don’t miss anything! Register here to see sessions live – or on demand to fit your schedule. Agenda highlights include:

Computational Storage Talks

  • Deploying Computational Storage at the Edge – discussing the deployment of small form factor, ASIC-based solutions, including a use case.
  • Next Generation Datacenters Require Composable Architecture Enablers and Deterministic Programmable Intelligence – explaining why determinism, parallel programming, and ease of programming are important.
  • Computational Storage Birds of a Feather LIVE Session – ask your questions of our experts and see live demos of computational storage production systems. Tuesday, September 22, 2020, 3:00 pm – 4:00 pm PDT (UTC-7).

Persistent Memory Presentations

  • Caching on PMEM: an Iterative Approach – discussing Twitter’s approach to exploring in-memory caching.
  • Challenges and Opportunities as Persistence Moves Up the Memory/Storage Hierarchy – showing how and why memory at all levels will become persistent.
  • Persistent Memory on eADR System – describing how the SNIA Persistent Memory Programming Model will include the possibility of platforms where the CPU caches are considered permanent and need no flushing.
  • Persistent Memory Birds of a Feather LIVE Session – ask your questions to our experts on your bootcamp progress, how to program PM, or what PM is shipping today. Tuesday, September 22, 2020, 4:00 pm – 5:00 pm PDT (UTC-7).

Solid State Storage Sessions

  • Enabling Ethernet Drives – provides a glimpse into a new SNIA standard that enables SSDs to have an Ethernet interface, and discusses the latest management standards for NVMe-oF drives.
  • An SSD for Automotive Applications – details efforts under way in JEDEC to define a new Automotive SSD standard.


Are We at the End of the 2.5-inch Disk Era?

Jonmichael Hands

Jul 20, 2020


The SNIA Solid State Storage Special Interest Group (SIG) recently updated the Solid State Drive Form Factor page to provide detailed information on dimensions; mechanical, electrical, and connector specifications; and protocols. On our August 4, 2020 SNIA webcast, we will take a detailed look at one of these form factors - Enterprise and Data Center SSD Form Factor (EDSFF) – challenging an expert panel to consider if we are at the end of the 2.5-in disk era.

Enterprise and Data Center SSD Form Factor (EDSFF) is designed natively for data center NVMe SSDs to improve thermal, power, performance, and capacity scaling. EDSFF has different variants for flexible and scalable performance, dense storage configurations, general purpose servers, and improved data center TCO.  At the 2020 Open Compute Virtual Summit, OEMs, cloud service providers, hyperscale data centers, and SSD vendors showcased products and their vision for how this new family of SSD form factors solves real data challenges.

During the webcast, our SNIA experts from companies that have been involved in EDSFF since the beginning will discuss how they will use the EDSFF form factor:

  • Hyperscale data center and cloud service provider panelists Facebook and Microsoft will discuss how E1.S (SNIA specification SFF-TA-1006) helps solve performance scalability, serviceability, capacity, and thermal challenges for future NVMe SSDs and persistent memory in 1U servers.
  • Server and storage system panelists Dell, HPE, Kioxia, and Lenovo will discuss their goals for the E3 family and the new updated version of the E3 specification (SNIA specification SFF-TA-1008)

We hope you can join us as we spend some time on this important topic.  Register here to save your spot.



Your Questions Answered on CMSI and More

Marty Foltyn

Jun 29, 2020


The “new” SNIA Compute, Memory, and Storage Initiative (CMSI) was formed at the beginning of 2020 out of the SNIA Solid State Storage Initiative.  The 45 companies who comprise the CMSI recognized the opportunity to combine storage, memory, and compute in new, novel, and useful ways; and to bring together technology, alliances, education, and outreach to better understand new opportunities and applications. 

To better explain this decision, and to talk about the various aspects of the Initiative, CMSI co-chair Alex McDonald invited CMSI members Eli Tiomkin, Jonmichael Hands, and Jim Fister to join him in a live SNIA webcast. 

If you missed the live webcast, we encourage you to watch it on demand as it was highly rated by attendees. Our panelists answered questions on computational storage, persistent memory, and solid state storage during the live event; here are answers to those and to ones we did not have time to get to.

Q1: In terms of the overall definition of Computational Storage, how do Computational Storage and the older Composable Storage terms interact? Are they the same? Are SNIA and the Computational Storage Technical Work Group (CS TWG) working on expanding computational storage uses?

A1: Some of the definitions that the CS TWG explores range from single-use storage functions — such as compression — to multiple services running in a more complex and programmable environment. The latter encompasses more of the thoughts around composable storage. So composable storage is a part of computational storage, and will continue to be incorporated as the definitions and programming models are developed. If you'd like to see the latest working document on the computational storage model, a draft can be found here.

Q2: In terms of some of the definitions of drive form factors, are the naming conventions completed?

A2: There are still opportunities to change definitions and naming. The work group is continuing to work on naming conventions for the latest specifications.

A2a: If you'd like to hear a great dialog on Alex's thoughts on naming conventions followed by Jim's notes on lunch menus, tune in at minute 48 of the webcast.

Q3: Is the E3 drive specification backward compatible with the existing E1.L or E1.S?

A3: The connector is the same, but the speeds are different. Existing testing infrastructure should work to test the drives. On a mainstream server, E3 is meant to be used in a backplane, whereas the prior standards would fit either an orthogonal connector or a backplane. So the two are sometimes compatible.

Q4: Will E1.S be an alternative for workstation-class laptops, replacing M.2? So would it be useful for higher capacity drives?

A4: M.2 is the mainstream form factor for laptops, desktops, and workstations. But the low power profile (8W) can limit drive performance. E1.S has 25W specified, and may be much more effective for higher-end workstations and desktops. Both specifications are likely to remain in volume. You can check out the various SSD specs on our Solid State Drive Form Factor page.

Q5: Does SNIA use too many S's in its acronyms?

A5: Alex McDonald thinks so.

Q6: Can we talk more about computational and composable storage?

A6: Alex McDonald gave the order for a detailed future SNIA CMSI webcast. Stay tuned!

Q7: Is there a PMDK port for Oracle Solaris?

A7: Currently no, but someone should submit a pull request at PMDK.org, and the magic geeks might work their powers. Given the close similarities, there is a distinct possibility that it can be done.

Q8: Does deduplication technology come into play with computational storage?

A8: Not immediately. These are mostly fixed functions right now, available on many drives. If it becomes an accelerator function, then that would be incorporated.

Q9: Is there that much difference between how software should handle magnetic drives, NVMe drives, and Persistent Memory?

A9: Yes. Any other questions?


25 Questions (and Answers) on Ethernet-attached SSDs

Ted Vojnovich

Apr 14, 2020

The SNIA Networking Storage Forum celebrated St. Patrick’s Day by hosting a live webcast, “Ethernet-attached SSDs – Brilliant Idea or Storage Silliness?” Even though we didn’t serve green beer during the event, the response was impressive with hundreds of live attendees who asked many great questions – 25 to be exact. Our expert presenters have answered them all here:

Q. Has a prototype drive been built today that includes the Ethernet controller inside the NVMe SSD?

A. There is an interposing board that extends the length by a small amount. Integrated functionality will come with volume and a business case. Some SSD vendors have plans to offer SSDs with fully-integrated Ethernet controllers.

Q. Costs seem to be the initial concern… true apples to apples between JBOF?

A. The difference is between a PCIe switch and an Ethernet switch. Ethernet switches usually cost more but provide more bandwidth than PCIe switches. An EBOF might cost more than a JBOF with the same number of SSDs and same capacity, but the EBOF is likely to provide more performance than the JBOF.

Q. What are the specification names and numbers? Which standards groups are involved?

A. The Native NVMe-oF Drive Specification from the SNIA is the primary specification. A public review version is here. Within that specification, multiple other standards are referenced from SFF, NVMe, and DMTF.

Q. How is this different than the “Kinetic”, “Object Storage”, etc. efforts of a few years ago? Is there any true production-quality open source available or planned (if so, when), and if so, by whom and where?

A. Kinetic drives were hard disks and thus did not need high speed Ethernet. In fact, new lower-speed Ethernet was developed for this case. The pins chosen for Kinetic would not accommodate the higher Ethernet speeds that SSDs need, so the new standard re-uses the same lanes defined for PCIe for use by Ethernet. Kinetic was a brand-new protocol and application interface rather than leveraging an existing standard interface such as NVMe-oF.

Q. Can OpenChannel SSDs be used as EBOF?

A. To the extent that Open Channel can work over NVMe-oF it should work.

Q. Define the signal integrity challenges of routing Ethernet at these speeds compared to PCIe.

A. The signal integrity of the SFF 8639 connector is considered good through 25Gb Ethernet. The SFF 1002 connector has been tested to 50Gb speeds with good signal integrity and may go higher. Ethernet is able to carry data with good signal integrity much farther than a PCIe connection of similar speed.

Q. Is there a way to expose Intel Optane DC Persistent Memory through NVMe-oF?

A. For now, it would need to be a block-based NVMe device. Byte addressability might be available in the future.

Q. Will there be an interposer to send the block IO directly over the switch?

A. For the Ethernet drive itself, there is a dongle available for standard PCIe SSDs to become an Ethernet drive that supports block IO over NVMe-oF.

Q. Do NVMe drives fail? Where is HA implemented? I never saw VROC from Intel adopted. So, does the user add latency when adding their own HA?

A. Drive reliability is not impacted by the fact that it uses Ethernet. HA can be implemented by dual-port versions of Ethernet drives; dual-port dongles are available today. For host- or network-based data protection, the fact that Ethernet drives can act as a secondary location for multiple hosts makes data protection easier.

Q. Ethernet is a contention protocol and TCP has overhead to deliver reliability. Is there any work going on to package something like Fibre Channel/QUIC or other solutions to eliminate the downsides of Ethernet and TCP?

A. FC-NVMe has been approved as a standard since 2017 and is available and maturing as a solution. NVMe-oF on Ethernet can run on RoCE or TCP with the option to use lossless Ethernet and/or congestion management to reduce contention, or to use accelerator NICs to reduce TCP overhead. QUIC is growing in popularity for web traffic but it’s not clear yet if QUIC will prove popular for storage traffic.

Q. Are Lenovo or other OEMs building standard EBOF storage servers? Does OCP have a work group on EBOF supporting hardware architecture and specification?

A. Currently, Lenovo does not offer an EBOF. However, many ODMs are offering JBOFs and a few are offering EBOFs. OCP is currently focusing on NVMe SSD specifics, including form factor. While several JBOFs have been introduced into OCP, we are not aware of an OCP EBOF specification per se. There are OCP initiatives to optimize the form factors of SSDs and there are also OCP storage designs for JBOF that could probably evolve into an Ethernet SSD enclosure with minimal changes.

Q. Is this an accurate statement on SAS latency? Where are you getting and quoting your data?

A. SAS is a transaction model, meaning the preceding transaction must complete before the next transaction can be started (QD does ameliorate this to some degree, but end points still have to wait). With the initiator and target having to wait for the steps to complete, overall throughput slows. SAS HDD = milliseconds per IO (governed by seek and rotation); SAS SSD = 100s of microseconds (governed by the transaction nature); NVMe SSD = 10s of microseconds (governed by the queuing paradigm).

Q. Regarding performance & scaling, a 50GbE port has less bandwidth than a PCIe Gen3 x4 connection. How is converting to Ethernet helping performance of the array? Doesn’t it face the same bottleneck of the NICs connecting the JBOF/EBOF to the rest of the network?

A. It eliminates the JBOF’s CPU and NIC(s) from the data path and replaces them with an Ethernet switch. The math: one 50GbE port = 5 GB/s, while one x4 PCIe Gen3 connection = 4 GB/s, because PCIe Gen3 = 8 Gb/s per lane; that is why a single 25GbE NIC is usually connected to 4 lanes of PCIe Gen3 and a single 50GbE NIC is usually connected to 8 lanes of PCIe Gen3 (or 4 lanes of PCIe Gen4). But that is half of the story; there are two other dimensions to consider. First, getting all this bandwidth (either way) out of a JBOF vs. an EBOF. Second, at the solution level, all these ports (connectivity) and scaling (bandwidth) present their own challenges.

Q. What about persistent memory? Can you present Optane DC through NVMe-oF?

A. Interesting idea!!! Today persistent memory DIMMs sit on the memory bus, so they would not benefit directly from an Ethernet architecture. But with the advent of CXL and PCIe Gen 5, there may be a place for persistent memory in “bays” for a more NUMA-like architecture.

Q. For those of us that use Ceph, this might be an interesting vertical integration, but it feels like there’s more to the latency of “finding” and “balancing” the data on the arrays of Ethernet-attached NVMe. Have there been any software suites to accompany these hardware changes, and are whitepapers published?

A. Ceph nodes are generally slower (for like-to-like HW) than non-Ceph storage solutions, so Ceph might be less likely to benefit from Ethernet SSDs, especially NVMe-oF SSDs. That said, if the cost model for ESSDs works out (really cheap Ceph nodes to overcome “throwing HW at the problem”), one could look at Ceph solutions using ESSDs, either via NVMe-oF or by creating ESSDs with a key-value interface that can be accessed directly by Ceph.

Q. Can the traditional array functions be moved to the LAN switch layer, either included in the switch (~the Cisco MDS and IBM SVC “experiment”) or by connecting the controller functionality to the LAN switch backbone with the SSDs in a separate VLAN?

A. Many storage functions are software/firmware driven. Certainly, a LAN switch with a rich x86 complex could do this…or…a server with a switch subsystem could. I can see low-level storage functions (RAID XOR, compression, maybe snapshots) translated to switch HW, but I don’t see a clear path for high-level functions (dedupe, replication, etc.) translated to switch HW. However, since hyperscale does not perform many high-level storage functions at the storage node, perhaps enough can be moved to switch HW over time.

Q. ATA over Ethernet has been working for nearly 18 years now. What is the difference?

A. ATA over Ethernet is more of a work group concept and has never gone mainstream (to be honest, your question is the first time I heard this since 2001). In any event, ATA does not take advantage of the queuing nature of NVMe, so it’s still held hostage by transaction latency. Also, there is no high availability (HA) in ATA (at least I am not aware of any HA standards for ATA), which presents a challenge because HA at the box or storage controller level does NOT solve the SPOF problem at the drive level.

Q. Request for comment – Ethernet offers 10G, 25G, 50G, 100G per lane (all available today) and Ethernet MAC speeds of 10G, 25G, 40G, 50G, 100G, 200G, 400G (all available today), so Ethernet is far more scalable compared to PCIe. Comparing relative cost, an Ethernet switch is far more economical than a PCIe switch. Why shouldn’t we switch?

A. Yes, Ethernet is more scalable than PCIe, but three things need to happen. 1) Solution-level orchestration has to happen (putting an EBOF behind an RBOF is okay but only the first step); 2) the Ethernet world has to start understanding how storage works (multipathing, ACLs, baseband drive management, etc.); 3) lower cost needs to be proven – the jury is still out on cost (on paper it’s a no-brainer, but the cost of the Ethernet switch in the I/O module can rival an x86 complex). Note that Ethernet with 100Gb/s per lane is not yet broadly available as of Q2 2020.

Q. We’ve seen issues with single network infrastructure from an availability perspective. Why would anyone put their business at risk in this manner? A second question is how will this work with multiple host or drive vendors, each having different specifications?

A. Customers already connect their traditional storage arrays to either single or dual fabrics, depending on their need for redundancy, and an Ethernet drive can do the same, so there is no rule that an Ethernet SSD must rely on a single network infrastructure. Some large cloud customers use data protection and recovery at the application level that spans multiple drives (or multiple EBOFs), providing high levels of data availability without needing dual fabric connections to every JBOF or to every Ethernet drive. For the second part of the question, it seems likely that all the Ethernet drives will support a standard Ethernet interface and most of them will support the NVMe-oF standard, so multiple host and drive vendors will interoperate using the same specifications. This has already been happening through UNH plugfests at the NIC/switch level. Areas where Ethernet SSDs might use different specifications could include a key-value or object interface, computational storage APIs, and management tools (if the host or drive maker doesn’t follow one of the emerging SNIA specifications).

Q. Will there be a plugfest or certification test for Ethernet SSDs?

A. Those Ethernet SSDs that use the NVMe-oF interface will be able to join the existing UNH IOL plugfests for NVMe-oF. Whether there are plugfests for any other aspects of Ethernet SSDs – such as key-value or computational storage APIs – likely depends on how many customers want to use those aspects and how many SSD vendors support them.

Q. Do you anticipate any issues with mixing control (Redfish/Swordfish) and data over the same ports?

A. No, it should be fine to run control and data over the same Ethernet ports. The only reason to run management outside of the data connection would be to diagnose or power cycle an SSD that is still alive but not responding on its Ethernet interface. If out-of-band management of power connections is required, it could be done with a separate management Ethernet connection to the EBOF enclosure.

Q. We will require more switch ports; would that mean more investment? Also, how is the management of Ethernet SSDs done?

A. Deploying Ethernet SSDs will require more Ethernet switch ports, though it will likely decrease the needed number of other switch or repeater ports (PCIe, SAS, Fibre Channel, InfiniBand, etc.). Also, there are models showing that Ethernet SSDs have certain cost advantages over traditional storage arrays even after including the cost of the additional Ethernet switch ports. Management of the Ethernet SSDs can be done via standard Ethernet mechanisms (such as SNMP), through NVMe commands (for NVMe-oF SSDs), and through the evolving DMTF Redfish/SNIA Swordfish management frameworks mentioned by Mark Carlson during the webcast. You can find more information on SNIA Swordfish here.

Q. Is it assumed that Ethernet-connected SSDs need to implement/support congestion control management, especially for cases of overprovisioning in an EBOF (i.e. the EBOF bandwidth is less than the sum of the underlying SSDs under it)? If so, is that standardized?

A. Yes, but both the NVMe/TCP and NVMe/RoCE protocols have congestion management as part of the protocol, so it is baked in. The eSSDs can connect to either a switch inside the EBOF enclosure or to an external Top-of-Rack (ToR) switch. That Ethernet switch may or may not be oversubscribed, but either way the protocol-based congestion management on the individual Ethernet SSDs will kick in if needed. But if the application does not access all the eSSDs in the enclosure at the same time, the aggregate throughput from the SSDs being used might not exceed the throughput of the switch. If most or all of the SSDs in the enclosure will be accessed simultaneously, then it could make sense to use a non-blocking switch (that will not be oversubscribed) or rely on the protocol congestion management.

Q. Are the industry/standards groups developing application protocols (IOS layers 5 thru 7) to allow customers to use existing OS/apps without modification? If so, when will these be available and via what delivery to the market, such as a new IETF application protocol, consortium, …?

A. Applications that can directly use individual SSDs can access an NVMe-oF Ethernet SSD directly as block storage, without modification and without using any other protocols. There are also software-defined storage solutions that already manage and virtualize access to NVMe-oF arrays, and they could be modified to allow applications to access multiple Ethernet SSDs without modifications to the applications. At higher levels of the IOS stack, the computational storage standard under development within SNIA or a key-value storage API could be other solutions to allow applications to access Ethernet SSDs, though in some cases the applications might need to be modified to support the new computational storage and/or key-value APIs.

Q. In an eSSD implementation, what system element implements advanced features like data streaming and IO determinism? Maybe a better question is: does the standard support this at the drive level?

A. Any features such as these that are already part of NVMe will work on Ethernet drives.
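
For reference, the raw-bandwidth comparison from the JBOF/EBOF answer above works out roughly as follows (line-rate figures only; encoding and protocol overhead are ignored).

```python
# Quick line-rate comparison of a 50GbE port versus the PCIe Gen3 links a NIC
# typically sits behind (8 Gb/s per Gen3 lane, overheads ignored).

def gbytes_per_s(gbits_per_s):
    return gbits_per_s / 8

nic_50gbe  = gbytes_per_s(50)       # one 50GbE port
pcie_g3_x4 = gbytes_per_s(4 * 8)    # 4 lanes x 8 Gb/s (PCIe Gen3)
pcie_g3_x8 = gbytes_per_s(8 * 8)    # what a 50GbE NIC usually needs host-side

print(f"50GbE port:   {nic_50gbe:.1f} GB/s")
print(f"PCIe Gen3 x4: {pcie_g3_x4:.1f} GB/s")
print(f"PCIe Gen3 x8: {pcie_g3_x8:.1f} GB/s")
```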


Are Ethernet-attached SSDs Brilliant?

Ted Vojnovich

Feb 12, 2020

Several solid state disk (SSD) and networking vendors have demonstrated ways to connect SSDs directly to an Ethernet network. They propose that deploying Ethernet SSDs will be more scalable, easier to manage, higher performance, and/or lower cost than traditional storage networking solutions that use a storage controller (or hyperconverged node) between the SSDs and the network.

Who would want to attach SSDs directly to the network? Are these vendors brilliant or simply trying to solve a problem that doesn’t exist? What are the different solutions that could benefit from Ethernet SSDs? Which protocols would one use to access them? How will orchestration be used to enable applications to find assigned Ethernet SSDs? How will Ethernet SSDs affect server subsystems such as Ethernet RAID/mirroring and solution management such as Ethernet SAN orchestration? And how do Ethernet SSDs relate to computational storage?

Find out on March 17, 2020 when the SNIA Ethernet Storage Forum presents a live webcast, “Ethernet-attached SSDs—Brilliant Idea or Storage Silliness?” In this webcast, SNIA experts will discuss:
  • Appropriate use cases for Ethernet SSDs
  • Why Ethernet SSDs could be cost-effective and efficient
  • How Ethernet SSDs compare to other forms of storage networking
  • Different ways Ethernet SSDs can be accessed, such as JBOF/NBOF, NVMe-oF, and Key Value
  • How Ethernet-attached SSDs enable composable infrastructures
Register now for what is sure to be an interesting discussion and debate on this technology.



Hyperscalers Take on NVMe™ Cloud Storage Questions

J Metz

Dec 2, 2019


Our recent webcast on how hyperscalers Facebook and Microsoft are working together to merge their SSD drive requirements generated a lot of interesting questions. If you missed "How Facebook & Microsoft Leverage NVMe Cloud Storage" you can watch it on-demand. As promised at our live event, here are answers to the questions we received.

Q. How does Facebook or Microsoft see Zoned Name Spaces being used?

A. Zoned Name Spaces are how we will consume QLC NAND broadly. The ability to write to the NAND sequentially in large increments that lay out nicely on the media allows for very little write amplification in the device.
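
As a purely conceptual sketch of why zoned writes suit QLC NAND (no real NVMe ZNS commands are used here), a zone only accepts writes at its write pointer, so data lands on the media in large, strictly sequential increments.

```python
# Toy model of a zone with a sequential write pointer; resetting the zone is
# the analogue of erasing the underlying NAND block(s).

class Zone:
    def __init__(self, size):
        self.size = size
        self.write_pointer = 0
        self.data = bytearray(size)

    def append(self, buf):
        """Sequential-only write; returns the offset that was written."""
        if self.write_pointer + len(buf) > self.size:
            raise ValueError("zone full; reset (erase) before reuse")
        start = self.write_pointer
        self.data[start:start + len(buf)] = buf
        self.write_pointer += len(buf)
        return start

    def reset(self):
        """Whole-zone reset before the zone can be rewritten."""
        self.write_pointer = 0

z = Zone(size=4096)
z.append(b"log entry 1")
z.append(b"log entry 2")   # always lands right after the previous write
```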

Q. How high a priority is firmware malware? Are there automated & remote management methods for detection and fixing at scale?

A. Security in the data center is one of the highest priorities. There are tools to monitor and manage the fleet including firmware checking and updating.

Q. If I understood correctly, the need for NVMe rooted from the need of communicating at faster speeds with different components in the network. Currently, at which speed is NVMe going to see no more benefit with higher speed because of the latencies in individual components? Which component is most gating/concerning at this point?

A. In today's SSDs, the NAND latency dominates. This can be mitigated by adding backend channels to the controller and optimizing data placement across the media. There are applications where storage is directly connected to the CPU; there, performance scales very well with PCIe lane speeds and does not have to deal with network latencies.

Q. Where does zipline fit? Does Microsoft expect Azure to default to zipline at both ends of the Azure network?

A. Microsoft has donated the RTL for the Zipline compression ASIC to Open Compute so that multiple endpoints can take advantage of "bump in the wire" inline compression.

Q. What other protocols exist that are competing with NVMe? What are the pros and cons for these to be successful?

A. SATA and SAS are the legacy protocols that NVMe was designed to replace. These protocols still have their place in HDD deployments.

Q. Where do you see U.2 form factor for NVMe?

A. Many enterprise solutions use U.2 in their 2U offerings. Hyperscale servers are mostly focused on 1U server form factors, where the compact heights of E1.S and E1.L allow for vertical placement on the front of the server.

Q. Is E1.L form factor too big (32 drives) for failure domain in a single node as a storage target?

A. E1.L allows for very high density storage. The storage application must take into account the possibility of device failure via redundancy (mirroring, erasure coding, etc.) and rapid rebuild. In the future, the ability for the SSD to slowly lose capacity over time will be required.

Q. What have been the biggest pain points in using NVMe SSDs since inception/adoption, especially since Microsoft and Facebook started using them?

A. As discussed in the live Q&A, in the early days of NVMe the lack of standard drives for both Windows and Linux hampered adoption. This has since been resolved with standard in box drive offerings.

Q. Has FB or Microsoft considered allowing drives to lose data if they lose power on an edge server? If the server is rebuilt on a power down, this can reduce SSD costs.

A. There are certainly interesting use cases where Power Loss Protection is not needed.

Q. Do zoned namespaces make the Denali spec obsolete or dropped by Microsoft? How does it impact/compete with open channel initiatives by Facebook?

A. Zoned Name Spaces incorporates probably 75% of the Denali functionality in an NVMe standardized way.

Q. How stable are NVMe PCIe hot plug devices (unmanaged hot plug)?

A. Quite stable.

Q. How do you see Ethernet SSDs impacting cloud storage adoption?

A. Not clear yet if Ethernet is the right connection mechanism for storage disaggregation.  CXL is becoming interesting.

Q. Thoughts on E3? What problems are being solved with E3?

A. E3 is meant more for 2U servers.

Q. ZNS has a lot of QoS implications as we load up so many dies on E1.L FF. Given the challenge how does ZNS address the performance requirements from regular cloud requirements?

A. With QLC, the end to end systems need to be designed to meet the application's requirements. This is not limited to the ZNS device itself, but needs to take into account the entire system.

If you're looking for more resources on any of the topics addressed in this blog, check out the SNIA Educational Library where you'll find over 2,000 vendor-neutral presentations, white papers, videos, technical specifications, webcasts and more.  


The Blurred Lines of Memory and Storage – A Q&A

John Kim

Jul 22, 2019

The lines are blurring as new memory technologies are challenging the way we build and use storage to meet application demands. That’s why the SNIA Networking Storage Forum (NSF) hosted a “Memory Pod” webcast in our series, “Everything You Wanted to Know about Storage, but were too Proud to Ask.” If you missed it, you can watch it on-demand here along with the presentation slides. We promised at the live event to answer the questions we received; here they are.

Q. Do tools exist to do secure data overwrite for security purposes?

A. Most popular tools are cryptographic signing of the data where you can effectively erase the data by throwing away the keys. There are a number of technologies available; for example, the usual ones like BitLocker (part of Windows 10, for example) where the NVDIMM-P is tied to a specific motherboard. There are others where the data is encrypted as it is moved from NVDIMM DRAM to flash for the NVDIMM-N type. Other forms of persistent memory may offer their own solutions. SNIA is working on a security model for persistent memory, and there is a presentation on our work here.

Q. Do you need to do any modification on the OS or application to support Direct Access (DAX)?

A. No, DAX is a feature of the OS (both Windows and Linux support it). DAX enables direct access to files stored in persistent memory or on a block device. Without DAX support in a file system, the page cache is generally used to buffer reads and writes to files, and DAX avoids that extra copy operation by performing reads and writes directly to the storage device.

Q. What is the holdup on finalizing the NVDIMM-P standard? Timeline?

A. The DDR5 NVDIMM-P standard is under development.

Q. Do you have a webcast on persistent memory (PM) hardware too?

A. Yes. The snia.org website has an educational library with over 2,000 educational assets. You can search for material on any storage-related topic. For instance, a search on persistent memory will get you all the presentations about persistent memory.

Q. Must persistent memory have Data Loss Protection (DLP)?

A. Since it’s persistent, the kind of DLP needed is the kind relevant for other classes of storage. This presentation on the SNIA Persistent Memory Security Threat Model covers some of this.

Q. Traditional SSDs are subject to “long tail” latencies, especially as SSDs fill and writes must be preceded by erasures. Is this “long-tail” issue reduced or avoided in persistent memory?

A. As PM is byte addressable and doesn’t require large block erasures, the flash kind of long tail latencies will be avoided. However, there are a number of proposed technologies for PM, and the read and write latencies and any possible long tail “stutters” will depend on their characteristics.

Q. Does PM have any Write Amplification Factor (WAF) issues similar to SSDs?

A. The write amplification (WA) associated with non-volatile memory (NVM) technologies comes from two sources.
  1. When the NVM material cannot be modified in place but requires some type of “erase before write” mechanism where the erasure domain (in bytes) is larger than the writes from the host to that domain.
  2. When the atomic unit of data placement on the NVM is larger than the size of incoming writes. Note the term used to denote this atomic unit can differ but is often referred to as a page or sector.
NVM technologies like the NAND used in SSDs suffer from both sources 1 and 2. This leads to very high write amplification under certain workloads, the worst being small random writes. It can also require overprovisioning; that is, requiring more NVM internally than is exposed to the user externally. Persistent memory technologies (for example Intel’s 3D XPoint) only suffer from source 2 and can in theory suffer WA when the writes are small. The severity of the write amplification depends on how the memory controller interacts with the media. For example, current PM technologies are generally accessed over a DDR-4 channel by an x86 processor. x86 processors send 64 bytes at a time down to a memory controller, and can send more in certain cases (e.g. interleaving, multiple channel parallel writes, etc.). This makes it far more complex to account for WA than a simplistic random byte write model or in comparison with writing to a block device.

Q. Persistent memory can provide faster access in comparison to NAND flash, but the cost is higher for persistent memory. What do you think about the usability of this technology in the future?

A. Very good. See this presentation, “MRAM, XPoint, ReRAM PM Fuel to Propel Tomorrow’s Computing Advances,” by analysts Tom Coughlin and Jim Handy for an in-depth treatment.

Q. Does PM have a ‘lifespan’ similar to SSDs (e.g. 3 years with heavy writes, 5 years)?

A. Yes, but that will vary by device technology and manufacturer. We expect the endurance to be very high; comparable to or better than the best of flash technologies.

Q. What is the performance difference between a fast SSD vs. “PM as DAX”?

A. As you might expect us to say: it depends. PM via DAX is meant as a bridge to using PM natively, but you might expect to have improved performance from PM over NVMe as compared with a flash-based SSD, as the latency of PM is much lower than flash; microseconds as opposed to low milliseconds.

Q. Does DAX work the same as SSDs?

A. No, but it is similar. DAX enables efficient block operations on PM similar to block operations on an SSD.

Q. Do we have any security challenges with PME?

A. Yes, and JEDEC is addressing them. Also see the Security Threat Model presentation here.

Q. On the presentation slide of what is or is not persistent memory, are you saying that in order for something to be PM it must follow the SNIA persistent memory programming model? If it doesn’t follow that model, what is it?

A. No, the model is a way of consuming this new technology. PM is anything that looks like memory (it is byte addressable via CPU load and store operations) and is persistent (it doesn’t require any external power source to retain information).

Q. DRAM is basically a capacitor. Without power, the capacitor discharges and so the data is volatile. What exactly is persistent memory? Does it store data inside DRAM or will it use flash to store data?

A. The presentation discusses two types of NVDIMM; one is based on DRAM and a flash backup that provides the persistence (that is NVDIMM-N), and the other is based on PM technologies (that is NVDIMM-P) that are themselves persistent, unlike DRAM.

Q. Slide 15: If persistent memory is fast and can appear as byte-addressable memory to applications, why bother with PM needing to be block addressed like disks?

A. Because it’s going to be much easier to support applications from day one if PM can be consumed like very fast disks. Eventually, we expect PM to be consumed directly by applications, but that will require them to be upgraded to take advantage of it.

Q. Can you please elaborate on byte and block addressable?

A. Block addressable is the way we do I/O; that is, data is read and written in large blocks of data, typically 4Kbytes in size. Disk interfaces like SCSI or NVMe take commands to read and write these blocks of data to the external device by transferring the data to and from CPU memory, normally DRAM. Byte addressable means that we’re not doing any I/O at all; the CPU instructions for loading and storing fast registers from memory are used directly on PM. This removes an entire software stack to do the I/O, and means we can efficiently work on much smaller units of data; down to the byte as opposed to the fixed 4Kb demanded by I/O interfaces. You can learn more in our presentation “File vs. Block vs. Object Storage.”

There are now 10 installments of the “Too Proud to Ask” webcast series. If you have an idea for an “Everything You Wanted to Know about Storage, but were too Proud to Ask” presentation, please comment on this blog and the NSF team will put it up for consideration.
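
To make the byte-addressable idea concrete, here is a hedged sketch: on a file system mounted with DAX on persistent memory, mapping a file and storing into it uses CPU loads and stores rather than block I/O. The mount point below is an assumption, and real durability also involves CPU cache flushing that plain Python can only approximate via msync.

```python
# Sketch of byte-granular access to a file on an assumed DAX mount
# (e.g. ext4/xfs mounted with -o dax on a pmem device). Plain mmap works on
# any file; only PM + DAX removes the block stack from the data path.

import mmap
import os

path = "/mnt/pmem0/example.dat"        # assumed DAX-mounted persistent memory
fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
os.ftruncate(fd, 4096)

with mmap.mmap(fd, 4096) as pm:
    pm[0:5] = b"hello"                 # a byte-granular store, no read/write syscalls
    pm.flush()                         # request durability (msync under the hood)

os.close(fd)
```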

