Tim Lustig

Apr 27, 2020


The SNIA Networking Storage Forum’s recent live webcast “QUIC – Will It Replace TCP/IP” was a fascinating presentation that was both highly rated and well-attended. Lars Eggert, technical director of networking at NetApp and current chair of the IETF working group that is delivering this new Internet protocol, explained the history of the protocol, how it is being adopted today, and what the future of QUIC deployment is likely to be. The session generated numerous questions. Here are answers to the questions Lars had time to address during the live event, as well as those we didn’t get to.

Q. Is QUIC appropriate/targeted to non-HTTP uses like NFS, SMB, iSCSI, etc.?

A. Originally, when Google kicked off QUIC, the web was the big customer for this protocol. This is still the case at the moment; the entire protocol design is very much driven by carrying web traffic better than TLS over TCP can. However, there’s strong interest from a number of organizations in running other applications and workloads on top of QUIC. For example, Microsoft has recently been talking about shipping SMB over QUIC. I fully expect we’re going to see other protocols that want to run on top of QUIC in the near future.

Q. Have
you mentioned which browsers (or other software) support QUIC?

A. At
the moment, Chrome, which supports Google QUIC, although that is quickly turning
into IETF QUIC with every new Chrome release. Firefox is implementing IETF QUIC
and I think is shipping it as part of their nightly builds. Most other browsers are Chrome- or Chromium-based, so Microsoft Edge, etc. will get it from Chromium and can then enable it at their leisure.

Q. How robust is QUIC to packet loss?

A. Currently,
QUIC uses TCP congestion control algorithms, so it’s very comparable to TCP.
And like TCP, it doesn’t do forward error correction, which was something that
Google QUIC did initially.
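To make that concrete, here is a toy sketch of the Reno-style additive-increase/multiplicative-decrease behavior that QUIC’s default congestion controller (NewReno, per RFC 9002) shares with TCP. The class name, constants, and numbers are invented for illustration and are not taken from any real QUIC stack.

```python
# Toy sketch of the Reno-style congestion control QUIC borrows from TCP.
# RFC 9002 specifies NewReno as QUIC's default; everything here is a
# simplification for illustration only.

MSS = 1200  # bytes; a typical QUIC maximum datagram payload

class RenoLikeCC:
    def __init__(self):
        self.cwnd = 10 * MSS           # initial congestion window
        self.ssthresh = float("inf")   # slow-start threshold

    def on_ack(self, acked_bytes):
        if self.cwnd < self.ssthresh:
            # slow start: grow the window by every byte acknowledged
            self.cwnd += acked_bytes
        else:
            # congestion avoidance: grow roughly one MSS per round trip
            self.cwnd += MSS * acked_bytes // self.cwnd

    def on_loss(self):
        # multiplicative decrease on a loss event
        self.ssthresh = self.cwnd // 2
        self.cwnd = max(self.ssthresh, 2 * MSS)

cc = RenoLikeCC()
cc.on_ack(10 * MSS)   # a full window is acknowledged during slow start
print(cc.cwnd)        # 24000: the window doubled
cc.on_loss()
print(cc.cwnd)        # 12000: halved after loss
```

The same halving-on-loss behavior is why the answer above says QUIC's loss response is "very comparable to TCP."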

Q. Can
you explain the term “ossification”?

A. Basically, it means that the network makes – often too narrow – assumptions about what “valid” traffic for a given protocol should look like, based on past and current traffic patterns. This limits the evolvability of a protocol, i.e., the network “ossifies” that protocol. For example, only a small set of TCP options has ever been defined, and various middleboxes in the network therefore drop TCP packets with options they don’t recognize. Some of these middleboxes might eventually be updated, but enough won’t be that TCP options – TCP’s main extension mechanism – have become much less useful than envisioned. The situation is worse when trying to redefine the meaning of header bits that were originally specified as reserved. What we have learned from this is that a protocol must carefully limit the number of plain-text bits it exposes to the network if it wants to retain long-term evolvability, which is a key goal for QUIC.

Q. What
does the acronym QUIC stand for?

A. It’s actually not an acronym anymore. When Jim Roskind came up with Google QUIC, it originally expanded to “Quick UDP Internet Connections,” but everyone has since decided that QUIC is simply the name of the protocol and not an acronym.

Q. Given
that QUIC is still based on IP and UDP, wouldn’t the middlebox issues remain?

A. No, they wouldn’t, at least not to the degree they do for TCP. UDP is a very minimal
protocol, and while middleboxes can drop UDP entirely, which would break QUIC,
everything else they might do (rewrite IP addresses and port numbers for NAT),
QUIC can handle. One caveat is that UDP was traditionally mostly used for DNS,
so many middleboxes use shorter binding lifetimes for UDP flows, but QUIC can
deal with that as well. Specifically, there are some measurements that UDP
works on about 95% of all paths, and there’s some anecdotal evidence that where
it doesn’t work it’s typically because it’s enterprise networks that just block
UDP completely.

Q. Do
you expect a push-back from network vendors or governments when they realize
that they can no longer do deep packet inspection and modification?

A. Yes,
we do, and we’ve seen it heavily already. So, the question is who can push
harder. There’s a big group of US banks that showed up in the IETF to complain
about TLS 1.3 enabling forward secrecy because, if I recall correctly, a whole bunch
of their compliance checks were based around taking traces of TLS 1.1 and 1.2 traffic, storing them, and then decrypting them later, which TLS 1.3 makes impossible. They
were not happy, but it’s the right thing to do for the web.

Q. Can
you explain where the latency/performance benefit comes from? Is it because UDP
replaces TCP and a lightweight implementation is possible?

A. A
lot of it comes from a faster handshake. You have these TLS session tickets
that let you basically send your “GET” with the first handshake packet to the
server and have the server return data within its first packets. That’s where a
lot of the latency benefits come from. In terms of bulk throughput there’s actually
not a whole lot of benefit because we’re just using TCP congestion control. So,
if you want to push a lot of bytes performance is going to be more or less the same
between QUIC and TCP. After a few hundred KB or so it doesn’t really matter
anymore what you’re using. For very fast paths, e.g., in datacenters, until we
see some NIC support for crypto offload and other QUIC operations, QUIC is not
going to be able to compete with TCP when it comes to high-speed bulk data.
QUIC adds another AES operation in addition to basic TLS which makes it hard to
offload to current-generation NICs. This will change and I think this
bottleneck will disappear, but at the moment QUIC is not your protocol if you
want to do datacenter bulk data.
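The handshake savings described above can be sketched with simple round-trip arithmetic. The counts below are idealized (no TCP Fast Open, no packet loss; real deployments vary):

```python
# Idealized round-trip counts before the client's first request can leave,
# for a few handshake combinations (simplified for illustration):
RTT_MS = 50  # example round-trip time

setups = {
    "TCP + TLS 1.2":        3,  # TCP handshake, then a 2-RTT TLS handshake
    "TCP + TLS 1.3":        2,  # TCP handshake, then a 1-RTT TLS handshake
    "QUIC (first contact)": 1,  # transport and crypto handshakes are combined
    "QUIC 0-RTT (resumed)": 0,  # the request rides in the very first flight
}

for name, rtts in setups.items():
    print(f"{name}: request sent after {rtts * RTT_MS} ms")
```

On a 50 ms path, that is the difference between waiting 150 ms and sending the “GET” immediately, which is where most of QUIC’s latency win comes from.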

Q. Do
you have any measurements comparing the energy/battery use of current QUIC
implementations compared to the traditional stack for mobile platforms?

A. Not
at the moment. Sorry.

Q. How
do you guarantee reliability with QUIC? Wouldn’t we have to borrow from TCP
here as well?

A. We’re borrowing exactly the same concepts that TCP uses for its reliability mechanisms. UDP, on the other hand, is built on the idea of “sending a packet and forgetting about it”: a message sent via UDP may or may not arrive, with no guarantee of delivery. QUIC detects and recovers from UDP loss.
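As an illustration of the borrowed concept, here is a toy Python sketch: number each packet, acknowledge what arrives, and retransmit the rest. Real QUIC loss detection (RFC 9002) adds ACK ranges, timers, and RTT estimation; the lossy channel here is simulated in memory.

```python
# Toy sketch of TCP-style reliability over an unreliable channel.
import random

random.seed(1)  # deterministic "loss" for the demo

def lossy_send(packets, loss_rate=0.3):
    """Deliver each packet with probability (1 - loss_rate)."""
    return {num: data for num, data in packets.items() if random.random() > loss_rate}

def reliable_transfer(data_chunks):
    unacked = dict(enumerate(data_chunks))  # packet number -> payload
    received = {}
    while unacked:                          # retransmit until everything is ACKed
        delivered = lossy_send(unacked)
        received.update(delivered)
        for num in delivered:               # an ACK takes the packet out of flight
            del unacked[num]
    return [received[i] for i in sorted(received)]

chunks = [f"chunk-{i}".encode() for i in range(5)]
assert reliable_transfer(chunks) == chunks
print("all chunks delivered despite loss")
```

QUIC implements this idea (sequence numbers, acknowledgments, retransmission of lost data) itself, on top of UDP’s fire-and-forget datagrams.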

Q. I understand and agree on not having the protocol in the kernel, and on the ability to rapidly evolve (a QUIC update is an application rollout). However, what about the security implications of this? It seems like it creates massive holes in the security paradigm.

A. It
depends. If you trust your kernel sure, but I think actually a lot of
applications are happy to do that in the app and only trust the kernel with
already encrypted data. So, it is changing the paradigm a little bit. But with
TLS, until very recently, it already happened at the application layer. We’ve
only recently seen TLS NIC support.

Q. Your
layer diagram shows QUIC taking over some HTTP functions and/or changing the
SAP between OSI layer 4 and layer 7. Will this be a problem to having other
protocols such as SNMP, FTP, etc. adopting QUIC?

A. Yes, I think that might be an artifact in the diagram. This is something that changed in the working group. In the beginning, when we started standardizing QUIC, we talked about an application and QUIC: the application was the thing on top of HTTP, and QUIC was providing HTTP semantics plus a transport protocol. That view has now somewhat evolved. Now when we talk about an application and QUIC, HTTP is the application and QUIC is the transport protocol for it – and there will be other applications on top of QUIC. So that diagram might have been a little stale, or maybe I had not updated it in a while. But the model very much is that QUIC intends to be a general-purpose transport protocol, albeit with a bunch of features that are inspired by what the web needs but that are hopefully useful for other protocols. So there should be a relatively clean interface that other applications can layer on top of.

Q. Does QUIC provide a Forward Error Correction (FEC) option?

A. Not at the moment. Google QUIC did initially, but reported mixed results, and therefore the current IETF QUIC does not use FEC. However, there’s now a better understanding of a different flavor of FEC that might actually be interesting. We’ve talked to some people who want to revisit that decision and maybe add FEC back.

Q.
How would QUIC apply for non-HTTP protocols? In particular, data-centric
protocols like SMB, NFS or iSCSI?

A. If you can run over TCP, you can run over QUIC, because QUIC essentially degrades into TCP if you use only one stream: with a single stream on a single connection, you basically have a TCP-like transport. If your application protocol can run on top of that, it can run on top of QUIC without changing much. If you want to take full advantage of QUIC – specifically the multiple parallel streams, prioritization, and so on – you will need to change your application protocol. If you have an application protocol that can run on top of SCTP, that binding is going to be very similar to a QUIC binding.
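A small sketch of why those parallel streams matter: each QUIC stream is reassembled independently, so a loss on one stream does not block delivery on another, unlike a single shared TCP byte stream. The frame format below, (stream id, offset, data), is invented for illustration.

```python
# Toy per-stream reassembly: a gap on one stream stalls only that stream.

class Stream:
    def __init__(self):
        self.buffer = {}        # offset -> data, for out-of-order frames
        self.delivered = b""
        self.next_offset = 0

    def on_frame(self, offset, data):
        self.buffer[offset] = data
        while self.next_offset in self.buffer:   # deliver the in-order prefix only
            chunk = self.buffer.pop(self.next_offset)
            self.delivered += chunk
            self.next_offset += len(chunk)

streams = {1: Stream(), 2: Stream()}
frames = [
    (1, 0, b"AAAA"),
    # (1, 4, b"BBBB") is lost in transit...
    (1, 8, b"CCCC"),
    (2, 0, b"XXXX"),   # ...but stream 2 is unaffected
    (2, 4, b"YYYY"),
]
for sid, off, data in frames:
    streams[sid].on_frame(off, data)

print(streams[1].delivered)  # b'AAAA'      (stalled at the gap)
print(streams[2].delivered)  # b'XXXXYYYY'  (fully delivered)
```

With one TCP connection, the lost segment would have stalled everything behind it; with per-stream reassembly, only the affected stream waits for the retransmission.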

Q. Do
/ Will corporate middleboxes block QUIC to preserve inspection abilities?

A. All
the things CSOs rely on to protect enterprise networks become harder to use
because they can’t see the traffic, let alone filter or block traffic. These
changes pose challenges for regulated industries such as financial services
where the organizations have to archive all incoming and outgoing
communications for compliance purposes.

Q. Do
standard HTTP engines/servers like Nginx support QUIC?

A. In May 2019, Nginx announced that it had started developing QUIC support. Many other servers, such as h2o and LiteSpeed, are also adding QUIC support.

Q. Do typical QUIC implementations support POSIX socket APIs yet? If not, is there a plan to provide POSIX-style wrappers over QUIC APIs to make adoption easier for POSIX-compliant applications?

A. No QUIC stack
that I know of has a POSIX abstraction API. Some have APIs that are somewhat
inspired by POSIX, but not to a degree where you could simply link against a
QUIC stack. One key reason is that if you want to maximize performance and
minimize latencies, the POSIX abstractions actually get in your way and make it
more difficult. Applications that want to optimize performance need to tie in
very deeply and directly with their transport stacks.

Q. Does
QUIC have a better future than Fibre Channel over Ethernet (FCoE) has had till
now?

A. Those are two
protocols with vastly different scopes of applicability. QUIC’s future
certainly seems very bright at least in the web ecosystem – pretty much all
players plan on migrating to HTTP/3 on top of QUIC.

Q. How
does QUIC affect current hardware deployments?

A. I don’t see QUIC
necessitating changes here.

Q. How
is SNI handled for web hosting of multiple domains on one server?

A. QUIC uses the SNI
in exactly the same way as TLS.

Q. How
does QUIC perform compared to TCP for a locally scoped IoT network (or say ad hoc
network made of mobile devices)?

A. I’m not aware of
a comparison of QUIC and TCP/TLS traffic on IoT networks. I have deployed my
QUIC stack on two embedded platforms (RIOT-OS and Particle DeviceOS), so it is
feasible to deploy QUIC on at least the higher-end of embedded boards, but I
have not had time to do a full performance analysis. (See https://eggert.org/papers/2020-ndss-quic-iot.pdf for what I
measured.)

Q. Do end devices need to be
adapted to QUIC, if yes, how?

A.
No. If the
system allows applications to send and receive UDP traffic, QUIC can be
deployed on them.
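In other words, the only OS facility a QUIC stack needs is the ordinary UDP sockets API, which every mainstream system already exposes. A minimal sketch over loopback (the payload is a placeholder, not a real QUIC packet):

```python
# A QUIC stack only needs plain UDP send/receive from the OS; everything
# else (handshake, streams, reliability) is userspace code on top of this.
import socket

server = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server.bind(("127.0.0.1", 0))      # let the OS pick a free port
server.settimeout(5)
addr = server.getsockname()

client = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
payload = b"a QUIC packet would be carried here"
client.sendto(payload, addr)

data, peer = server.recvfrom(2048)
print(data == payload)             # the datagram arrived intact
server.close()
client.close()
```

Because this API is universally available, QUIC can ship inside applications and be updated with them, with no kernel or device changes required.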

Q.
Is it possible to implement QUIC in an IoT environment considering the CPU and
memory cost of QUIC and its code size?

A.
Yes. I have a
proof-of-concept of my QUIC stack on two IoT systems, where a simple client
app, QUIC and TLS together use about 64KB of flash and maybe 10-20 KB of RAM.
See https://eggert.org/papers/2020-ndss-quic-iot.pdf.

Q.
Is the QUIC API exposed to applications (i.e., other than HTTP) a message-based
(e.g., like UDP) or byte stream (e.g., like TCP)?

A.
There really
is no common QUIC API that multiple different stacks would all implement. Each
stack defines its own API, which is tailored to the needs of the specific
applications it intends to support.

Q.
My understanding is that TCP is ‘consistent’ across all implementations, but I
see individual versions from each vendor involved currently (and interop being
tested, etc.) – why are individual/custom variants required?

A.
All vendors
are implementing the current version of IETF QUIC. QUIC makes it very easy to
negotiate use of a private or proprietary variant during the standard
handshake, and some vendors may eventually use that to migrate away from
standard QUIC. We’re certainly testing that capability during interop, but I’m
not aware of anyone planning on shipping proprietary versions at the moment.

Q.
SMB over QUIC comment: I can’t speak for Microsoft, of course, but I have been
through some of their presentations on SMB over QUIC. One feature of using QUIC
is connection stability, particularly over WiFi. The QUIC connection can
survive a transfer from one Access Point to another on different routers, for
example.

A.
Yes, QUIC
uses connection identifiers instead of IP addresses and ports to identify
connections, so QUIC connections can survive changes to those, such as when
access networks are changed.

Q.
So QUIC is basically utilizing UDP in a new way?

A.
Not really.
QUIC is using UDP to send packets just as any other application would.

Q. To be clearer on my security concerns: I’m thinking of malicious apps/actors doing or hiding data exfiltration inside the new QUIC environment/protocol. The very legacy environment we’re leaving behind at least provides the ability to inspect and prevent inappropriate data transmission. How do we do this with QUIC?

A.
You need to
have control of the endpoint and make the QUIC stack export TLS keying
material.
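For example, many TLS stacks (including the ones QUIC implementations build on) support the SSLKEYLOGFILE mechanism, which writes per-session secrets in a format that tools like Wireshark can use to decrypt captured traffic. A sketch using Python’s ssl module, which exposes this directly (the log path below is an arbitrary example; OpenSSL 1.1.1+ is required):

```python
# Sketch: export TLS per-session secrets from an endpoint you control,
# so captured traffic can be decrypted for inspection.
import os
import ssl
import tempfile

keylog_path = os.path.join(tempfile.gettempdir(), "tls-keys.log")  # example path

ctx = ssl.create_default_context()
ctx.keylog_filename = keylog_path  # secrets from each handshake get appended here
print(ctx.keylog_filename)
```

This is endpoint-side visibility: you control the application, so you can make its TLS/QUIC stack export the keys, rather than relying on a middlebox seeing plaintext on the wire.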

Q.
UDP + CC + TLS + HTTP = QUIC. What does “CC” stand for?

A.
Congestion
control.


Encryption 101: Keeping Secrets Secret

Alex McDonald

Apr 20, 2020


Encryption has been used through the ages to protect information, authenticate messages, communicate secretly in the open, and even to check that messages were properly transmitted and received without having been tampered with. Now, it's our first go-to tool for making sure that data simply isn't readable, hearable or viewable by enemy agents, smart surveillance software or other malign actors.

But how does encryption actually work, and how is it managed? How do we ensure security and protection of our data, when all we can keep as secret are the keys to unlock it? How do we protect those keys; i.e., "Who will guard the guards themselves?"

It's a big topic that we're breaking down into three sessions as part of our Storage Networking Security Webcast Series: Encryption 101, Key Management 101, and Applied Cryptography.

Join us on May 20th for the first Encryption webcast: Storage Networking Security: Encryption 101 where our security experts will cover:

  • A brief history of Encryption
  • Cryptography basics
  • Definition of terms – Entropy, Cipher, Symmetric & Asymmetric Keys, Certificates and Digital signatures, etc. 
  • Introduction to Key Management

I hope you will register today to join us on May 20th. Our experts will be on-hand to answer your questions.



Share Your Experiences in Programming PM!

Marty Foltyn

Apr 14, 2020

by Jim Fister, SNIA Director of Persistent Memory Enabling

Last year, the University of California San Diego (UCSD) Non-Volatile Systems Lab (NVSL) teamed with the Storage Networking Industry Association (SNIA) to launch a new conference, Persistent Programming In Real Life (PIRL). While not an effort to set the record for acronyms in a conference announcement, we did consider it a side-goal. The PIRL conference was focused on gathering a group of developers and architects for persistent memory to discuss real-world results. We wanted to know what worked and what didn’t, what was hard and what was easy, and how we could help more developers move forward.

You don’t need another pep talk about how the world has changed and all the things you need to do (though staying home and washing your hands is a pretty good idea right now). But if you’d like a pep talk on sharing your experiences with persistent memory programming, then consider this just what you need. We believe that the spirit of PIRL — discussing the results of persistent memory programming in real life — should continue.

If you’re not aware, SNIA has been delivering some very popular webcasts on persistent programming, cloud storage, and a variety of other topics. SNIA has a great new webcast featuring PIRL alumni Steve Heller, SNIA CMSI co-chair Alex McDonald, and me on the SNIA NVDIMM programming challenge and the winning entry. You can find more information and check the on-demand viewing at https://www.brighttalk.com/webcast/663/389451.

We would like to highlight more “In Real Life” topics via our SNIA webcast channel. Therefore, SNIA and UCSD NVSL have teamed up to create a submission portal for anyone interested in discussing their real-world persistent memory experiences. You can submit a topic at https://docs.google.com/forms/d/e/1FAIpQLSe_Ypo_sf1xxFcPD1F7se02jOWrdslosUnvwyS0RwcQpWAHiA/viewform, where we will evaluate your submission. Accepted submissions will be featured on the SNIA channel over the coming months.

As a final note, this year’s PIRL conference was originally scheduled for July. Even though most software developers are already used to social isolation and distancing from their peers, our organizing team has kept abreast of all the latest information to make a decision on the feasibility of an in-person conference on that date. In our last meeting, we agreed that it would not be prudent to hold the conference on the July date, and we have tentatively rescheduled the in-person conference to October 13-14, 2020. We will announce an exact date and our criteria for moving forward in the coming weeks, so stay tuned!


25 Questions (and Answers) on Ethernet-attached SSDs

Ted Vojnovich

Apr 14, 2020

The SNIA Networking Storage Forum celebrated St. Patrick’s Day by hosting a live webcast, “Ethernet-attached SSDs – Brilliant Idea or Storage Silliness?” Even though we didn’t serve green beer during the event, the response was impressive, with hundreds of live attendees who asked many great questions – 25 to be exact. Our expert presenters have answered them all here:

Q. Has a prototype drive been built today that includes the Ethernet controller inside the NVMe SSD?

A. There is an interposing board that extends the length by a small amount. Integrated functionality will come with volume and a business case. Some SSD vendors have plans to offer SSDs with fully-integrated Ethernet controllers.

Q. Costs seem to be the initial concern… true apples to apples between JBOF?

A. The difference is between a PCIe switch and an Ethernet switch. Ethernet switches usually cost more but provide more bandwidth than PCIe switches. An EBOF might cost more than a JBOF with the same number of SSDs and the same capacity, but the EBOF is likely to provide more performance than the JBOF.

Q. What are the specification names and numbers? Which standards groups are involved?

A. The Native NVMe-oF Drive Specification from SNIA is the primary specification. A public review version is here. Within that specification, multiple other standards are referenced from SFF, NVMe, and DMTF.

Q. How is this different than the “Kinetic”, “Object Storage”, etc. efforts of a few years ago? Is there any true production-quality open source available or planned? If so, when, by whom, and where?

A. Kinetic drives were hard disks and thus did not need high-speed Ethernet; in fact, new lower-speed Ethernet was developed for this case. The pins chosen for Kinetic would not accommodate the higher Ethernet speeds that SSDs need, so the new standard re-uses the same lanes defined for PCIe for use by Ethernet.
Kinetic was also a brand-new protocol and application interface, rather than leveraging an existing standard interface such as NVMe-oF.

Q. Can Open-Channel SSDs be used in an EBOF?

A. To the extent that Open-Channel can work over NVMe-oF, it should work.

Q. Define the signal-integrity challenges of routing Ethernet at these speeds compared to PCIe.

A. The signal integrity of the SFF-8639 connector is considered good through 25Gb Ethernet. The SFF-1002 connector has been tested to 50Gb speeds with good signal integrity and may go higher. Ethernet is able to carry data with good signal integrity much farther than a PCIe connection of similar speed.

Q. Is there a way to expose Intel Optane DC Persistent Memory through NVMe-oF?

A. For now, it would need to be a block-based NVMe device. Byte addressability might be available in the future.

Q. Will there be an interposer to send block IO directly over the switch?

A. For the Ethernet drive itself, there is a dongle available that turns a standard PCIe SSD into an Ethernet drive that supports block IO over NVMe-oF.

Q. Do NVMe drives fail? Where is HA implemented? I never saw VROC from Intel adopted. So, does the user add latency when adding their own HA?

A. Drive reliability is not impacted by the fact that a drive uses Ethernet. HA can be implemented by dual-port versions of Ethernet drives; dual-port dongles are available today. For host- or network-based data protection, the fact that Ethernet drives can act as a secondary location for multiple hosts makes data protection easier.

Q. Ethernet is a contention protocol, and TCP has overhead to deliver reliability. Is there any work going on to package something like Fibre Channel/QUIC or other solutions to eliminate the downsides of Ethernet and TCP?

A. FC-NVMe has been approved as a standard since 2017 and is available and maturing as a solution.
NVMe-oF on Ethernet can run on RoCE or TCP, with the option to use lossless Ethernet and/or congestion management to reduce contention, or to use accelerator NICs to reduce TCP overhead. QUIC is growing in popularity for web traffic, but it’s not clear yet whether QUIC will prove popular for storage traffic.

Q. Are Lenovo or other OEMs building standard EBOF storage servers? Does OCP have a work group on EBOF supporting hardware architecture and specification?

A. Currently, Lenovo does not offer an EBOF. However, many ODMs are offering JBOFs and a few are offering EBOFs. OCP is currently focusing on NVMe SSD specifics, including form factor. While several JBOFs have been introduced into OCP, we are not aware of an OCP EBOF specification per se. There are OCP initiatives to optimize the form factors of SSDs, and there are also OCP storage designs for JBOF that could probably evolve into an Ethernet SSD enclosure with minimal changes.

Q. Is that an accurate statement on SAS latency? Where are you getting and quoting your data?

A. SAS is a transaction model, meaning the preceding transaction must complete before the next transaction can be started (queue depth does ameliorate this to some degree, but endpoints still have to wait). With the initiator and target having to wait for the steps to complete, overall throughput slows. SAS HDD = milliseconds per IO (governed by seek and rotation); SAS SSD = hundreds of microseconds (governed by the transactional nature); NVMe SSD = tens of microseconds (governed by the queuing paradigm).

Q. Regarding performance and scaling, a 50GbE port has less bandwidth than a PCIe Gen3 x4 connection. How does converting to Ethernet help the performance of the array? Doesn’t it face the same bottleneck of the NICs connecting the JBOF/EBOF to the rest of the network?

A. It eliminates the JBOF’s CPU and NIC(s) from the data path and replaces them with an Ethernet switch. Math: one port of 50GbE ≈ 5 GB/s, while one PCIe Gen3 x4 port ≈ 4 GB/s, because PCIe Gen3 runs at 8 Gb/s per lane; that is why a single 25GbE NIC is usually connected to 4 lanes of PCIe Gen3 and a single 50GbE NIC to 8 lanes of PCIe Gen3 (or 4 lanes of PCIe Gen4). But that is half of the story; there are two other dimensions to consider. First, getting all this bandwidth (either way) out of the JBOF vs. an EBOF. Second, at the solution level, all these ports (connectivity) and scaling (bandwidth) present their own challenges.

Q. What about persistent memory? Can you present Optane DC through NVMe-oF?

A. Interesting idea! Today persistent memory DIMMs sit on the memory bus, so they would not benefit directly from an Ethernet architecture. But with the advent of CXL and PCIe Gen5, there may be a place for persistent memory in “bays” for a more NUMA-like architecture.

Q. For those of us that use Ceph, this might be an interesting vertical integration, but it feels like there’s more to the latency of “finding” and “balancing” the data on arrays of Ethernet-attached NVMe. Have any software suites accompanied these hardware changes, and are whitepapers published?

A. Ceph nodes are generally slower (for like-to-like hardware) than non-Ceph storage solutions, so Ceph might be less likely to benefit from Ethernet SSDs, especially NVMe-oF SSDs. That said, if the cost model for eSSDs works out (really cheap Ceph nodes to overcome “throwing hardware at the problem”), one could look at Ceph solutions using eSSDs, either via NVMe-oF or by creating eSSDs with a key-value interface that can be accessed directly by Ceph.

Q. Can the traditional array functions be moved to the LAN switch layer, either included in the switch (like the Cisco MDS and IBM SVC “experiment”) or by connecting the controller functionality to the LAN switch backbone with the SSDs in a separate VLAN?

A. Many storage functions are software/firmware driven. Certainly, a LAN switch with a rich x86 complex could do this… or… a server with a switch subsystem could.
I can see low-level storage functions (RAID XOR, compression, maybe snapshots) translated to switch hardware, but I don’t see a clear path for high-level functions (dedupe, replication, etc.). However, since hyperscalers do not perform many high-level storage functions at the storage node, perhaps enough can be moved to switch hardware over time.

Q. ATA over Ethernet has been working for nearly 18 years now. What is the difference?

A. ATA over Ethernet is more of a work-group concept and has never gone mainstream (to be honest, your question is the first time I have heard of it since 2001). In any event, ATA does not take advantage of the queuing nature of NVMe, so it’s still held hostage by transaction latency. Also, there is no high availability (HA) in ATA (at least I am not aware of any HA standards for ATA), which presents a challenge because HA at the box or storage-controller level does NOT solve the single-point-of-failure problem at the drive level.

Q. Request for comment – with Ethernet at 10G, 25G, 50G, and 100G per lane (all available today), and Ethernet MAC speeds of 10G, 25G, 40G, 50G, 100G, 200G, and 400G (all available today), Ethernet is far more scalable than PCIe. Comparing relative costs, an Ethernet switch is far more economical than a PCIe switch. Why shouldn’t we switch?

A. Yes, Ethernet is more scalable than PCIe, but three things need to happen: 1) solution-level orchestration (putting an EBOF behind an RBOF is okay, but only the first step); 2) the Ethernet world has to start understanding how storage works (multipathing, ACLs, baseband drive management, etc.); and 3) lower cost needs to be proven – the jury is still out on cost (on paper it’s a no-brainer, but the cost of the Ethernet switch in the I/O module can rival an x86 complex). Note that Ethernet with 100Gb/s per lane is not yet broadly available as of Q2 2020.

Q. We’ve seen issues with single network infrastructure from an availability perspective. Why would anyone put their business at risk in this manner? Second question: how will this work with multiple host or drive vendors, each having different specifications?

A. Customers already connect their traditional storage arrays to either single or dual fabrics, depending on their need for redundancy, and an Ethernet drive can do the same, so there is no rule that an Ethernet SSD must rely on a single network infrastructure. Some large cloud customers use data protection and recovery at the application level that spans multiple drives (or multiple EBOFs), providing high levels of data availability without needing dual-fabric connections to every JBOF or to every Ethernet drive. For the second part of the question, it seems likely that all Ethernet drives will support a standard Ethernet interface and most of them will support the NVMe-oF standard, so multiple host and drive vendors will interoperate using the same specifications. This has already been happening through UNH plugfests at the NIC/switch level. Areas where Ethernet SSDs might use different specifications include a key-value or object interface, computational storage APIs, and management tools (if the host or drive maker doesn’t follow one of the emerging SNIA specifications).

Q. Will there be a plugfest or certification test for Ethernet SSDs?

A. Those Ethernet SSDs that use the NVMe-oF interface will be able to join the existing UNH-IOL plugfests for NVMe-oF. Whether there are plugfests for other aspects of Ethernet SSDs – such as key-value or computational storage APIs – likely depends on how many customers want to use those aspects and how many SSD vendors support them.

Q. Do you anticipate any issues with mixing control (Redfish/Swordfish) and data over the same ports?

A. No, it should be fine to run control and data over the same Ethernet ports. The only reason to run management outside of the data connection would be to diagnose or power-cycle an SSD that is still alive but not responding on its Ethernet interface.
If out-of-band management of power connections is required, it could be done with a separate management Ethernet connection to the EBOF enclosure.

Q. We will require more switch ports. Does that mean more investment? Also, how is the management of Ethernet SSDs done?

A. Deploying Ethernet SSDs will require more Ethernet switch ports, though it will likely decrease the needed number of other switch or repeater ports (PCIe, SAS, Fibre Channel, InfiniBand, etc.). Also, there are models showing that Ethernet SSDs have certain cost advantages over traditional storage arrays even after including the cost of the additional Ethernet switch ports. Management of the Ethernet SSDs can be done via standard Ethernet mechanisms (such as SNMP), through NVMe commands (for NVMe-oF SSDs), and through the evolving DMTF Redfish/SNIA Swordfish management frameworks mentioned by Mark Carlson during the webcast. You can find more information on SNIA Swordfish here.

Q. Is it assumed that Ethernet-connected SSDs need to implement/support congestion management, especially for cases of oversubscription in an EBOF (i.e., EBOF bandwidth is less than the sum of the underlying SSDs under it)? If so, is that standardized?

A. Yes, but both the NVMe/TCP and NVMe/RoCE protocols have congestion management as part of the protocol, so it is baked in. The eSSDs can connect to either a switch inside the EBOF enclosure or to an external top-of-rack (ToR) switch. That Ethernet switch may or may not be oversubscribed, but either way the protocol-based congestion management on the individual Ethernet SSDs will kick in if needed. But if the application does not access all the eSSDs in the enclosure at the same time, the aggregate throughput from the SSDs being used might not exceed the throughput of the switch.
If most or all of the SSDs in the enclosure will be accessed simultaneously, then it could make sense to use a non-blocking switch (one that will not be oversubscribed) or to rely on the protocol congestion management.

Q. Are the industry/standards groups developing application protocols (OSI layers 5 through 7) to allow customers to use existing OSes/applications without modification? If so, when will these be available, and via what delivery to the market, such as a new IETF application protocol, a consortium, etc.?

A. Applications that can directly use individual SSDs can access an NVMe-oF Ethernet SSD directly as block storage, without modification and without using any other protocols. There are also software-defined storage solutions that already manage and virtualize access to NVMe-oF arrays, and they could be modified to allow applications to access multiple Ethernet SSDs without modifications to the applications. At higher levels of the OSI stack, the computational storage standard under development within SNIA or a key-value storage API could be other solutions to allow applications to access Ethernet SSDs, though in some cases the applications might need to be modified to support the new computational storage and/or key-value APIs.

Q. In an eSSD implementation, what system element implements advanced features like data streaming and I/O determinism? Maybe a better question is: does the standard support this at the drive level?

A. Any features such as these that are already part of NVMe will work on Ethernet drives.
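The oversubscription scenario discussed above can be quantified with simple arithmetic. Here is a minimal sketch; the drive and uplink bandwidth figures are illustrative assumptions, not numbers from the webcast:

```python
# Estimate oversubscription of an EBOF's uplinks relative to its SSDs.
# All bandwidth figures below are illustrative assumptions.

def oversubscription_ratio(num_ssds, ssd_gbps, num_uplinks, uplink_gbps):
    """Ratio of aggregate SSD bandwidth to aggregate uplink bandwidth.

    A ratio above 1.0 means the switch can be oversubscribed if all
    drives are driven at full speed simultaneously, in which case the
    protocol-level congestion management (NVMe/TCP or NVMe/RoCE)
    would be expected to kick in.
    """
    return (num_ssds * ssd_gbps) / (num_uplinks * uplink_gbps)

# Example: 24 eSSDs at 25 Gb/s each behind 6 x 100 Gb/s uplinks.
ratio = oversubscription_ratio(24, 25, 6, 100)
print(f"oversubscription ratio: {ratio:.2f}")  # 600/600 -> 1.00
```

If an application only touches a subset of the drives at once, the effective ratio is computed over the active drives only, which is why an oversubscribed switch can still perform well in practice.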

Olivia Rhye

Product Manager, SNIA

The Challenges IoT Brings to Storage and Data Strategy

Alex McDonald

Apr 13, 2020


Data generated from the Internet of Things (IoT) is increasing exponentially, and more and more we are seeing compute and inference move to the edge. This is driven not only by the growing capability to generate data from sensors, devices, and people operating in the field, but also by the interaction between those devices.

This new source of IoT data and information brings with it unique challenges to the way we store and transmit data as well as the way we need to curate it. It’s the topic the SNIA Cloud Storage Technologies Initiative will tackle at our live webcast on May 14, 2020, The influence of IoT on Data Strategy. In this webcast we will look at:

  • New patterns generated by the explosion of the Internet of Things
  • How IoT is impacting storage and data strategies
  • Security and privacy issues and considerations
  • How to think about the lifecycle of our information in this new environment

The SNIA experts presenting are sure to offer new insights into the challenges IoT presents. And since this will be live, they’ll be on hand to answer your questions on the spot. Register today. We hope you’ll join us.



Addressing Cloud Security Threats with Standards

Eric Hibbard

Apr 8, 2020

In a recent SNIA webinar, Cloud Standards: What They Are, Why You Should Care, the SNIA Cloud Storage Technologies Initiative (CSTI) highlighted some of the key cloud computing standards being developed and published by the ISO/IEC JTC 1/SC 38 (Cloud Computing and Distributed Platforms) and SC 27 (Information security, cybersecurity and privacy protection) standards committees. While ISO and IEC are not the only organizations producing cloud computing standards and specifications (e.g., ITU-T, OASIS, NIST, ENISA, SNIA, etc.), their standards, sometimes developed jointly with ITU-T, can play a role in addressing WTO Agreement on Technical Barriers to Trade (TBT) issues. More importantly, they provide a baseline of cloud terminology, concepts, guidance/requirements, and expectations that are recognized internationally.

Cloud Terminology

As highlighted in the SNIA CSTI webinar, establishing a common cloud vocabulary was an early concern because several software providers engaged in a bit of cloud washing, which injected confusion into the market space. ISO/IEC 17788 | ITU-T Y.3500 (Cloud computing – Overview and vocabulary), which drew heavily on NIST Special Publication 800-145 (The NIST Definition of Cloud Computing), and ISO/IEC 17789 | ITU-T Y.3502 (Cloud computing – Reference architecture) clarified many aspects of cloud computing (e.g., key characteristics, deployment models, roles and activities, service categories, frameworks, etc.). Since their publication, however, there have been many developments and clarifications within cloud computing, so SC 38 is working to capture these details in a new multi-part standard, ISO/IEC 22123, with Part 1 focused on cloud terminology and Part 2 expanding the cloud concepts; look for Part 1 later in 2020.
Both ISO/IEC 17788 and ISO/IEC 17789 are available at no cost from the ISO web site (see https://standards.iso.org/ittf/PubliclyAvailableStandards/) as well as the ITU-T SG13 web site (see https://www.itu.int/en/ITU-T/studygroups/2017-2020/13/Pages/default.aspx).

Cloud Computing – SLA Framework

Another cloud standard highlighted in the SNIA CSTI webinar was the multi-part ISO/IEC 19086 (Cloud computing – Service level agreement (SLA) framework). This service- and vendor-neutral standard offers a unified set of considerations for organizations to help them make decisions about cloud adoption, as well as create a common ground for comparing cloud service offerings. Part 1 establishes a set of common cloud SLA building blocks (concepts, terms, definitions, contexts) that can be used to create cloud SLAs. Part 2 defines a model for specifying metrics for cloud SLAs. Part 3 specifies the core conformance requirements for SLAs for cloud services based on Part 1, along with guidance on those requirements. Part 4 specifies conformance requirements for SLAs that address the security and protection of PII components. Both ISO/IEC 19086-1 and ISO/IEC 19086-2 are available at no cost from the ISO web site (see https://standards.iso.org/ittf/PubliclyAvailableStandards/).

Security Techniques for Supplier Relationships

The next standard highlighted in the webinar was ISO/IEC 27036 (Security techniques – Information security for supplier relationships). As the title implies, this multi-part standard offers guidance on the evaluation and treatment of information risks involved in the acquisition of goods and services from suppliers (i.e., supply chain security).
  • Part 1 (Overview and concepts) provides general background information and introduces the key terms and concepts in relation to information security in supplier relationships, including information risks commonly arising from or relating to business relationships between acquirers and suppliers.
  • Part 2 (Requirements) specifies fundamental information security requirements pertaining to business relationships between suppliers and acquirers of various products (goods and services); although Part 2 contains requirements, the document explicitly states that it is not intended for certification purposes.
  • Part 3 (Guidelines for ICT supply chain security) guides both suppliers and acquirers of ICT goods and services on information risk management relating to the widely dispersed and complex supply chain (e.g., malware, counterfeit products, organizational risks); Part 3 does not address business continuity management.
  • Part 4 (Guidelines for security of cloud services) guides cloud providers and customers on gaining visibility into the information security risks associated with the use of cloud services, managing those risks effectively, and responding to risks specific to the acquisition or provision of cloud services that can have an information security impact on organizations using these services.

SC 27 has initiated efforts to revise ISO/IEC 27036, but new versions are unlikely to be available before 2023. ISO/IEC 27036-1 is available at no cost from the ISO web site (see https://standards.iso.org/ittf/PubliclyAvailableStandards/).
Cloud Security & Privacy

The last group of cloud standards covered in the webinar were a few from SC 27 related to cloud security and privacy. ISO/IEC 27017 | ITU-T X.1631 (Security techniques – Code of practice for information security controls based on ISO/IEC 27002 for cloud services) provides both cloud customers and providers with additional information security controls and implementation advice, beyond that provided in ISO/IEC 27002, in the cloud computing context; this document was not intended to certify the security of cloud service providers specifically, because they can be certified compliant with ISO/IEC 27001 like any other organization. ISO/IEC 27018 (Security techniques – Code of practice for protection of Personally Identifiable Information (PII) in public clouds acting as PII processors) expands upon ISO/IEC 27002 and provides guidance aimed at ensuring that cloud service providers (public cloud) offer suitable information security controls to protect the privacy of their customers’ clients by securing PII entrusted to them. ISO/IEC 27040 (Security techniques – Storage security) provides guidance on securing most forms of storage technology, on which cloud is often dependent, as well as specifically addressing cloud storage. SC 27 has initiated efforts to revise ISO/IEC 27040, but a new version is unlikely to be available before 2023. While not specific to cloud, the webinar also covered ISO/IEC 27701 (Security techniques — Extension to ISO/IEC 27001 and to ISO/IEC 27002 for privacy information management — Requirements and guidelines) because its recent publication is likely to have an impact on ISO/IEC 27018, especially since certified compliance with this standard is under discussion within SC 27.

But Wait, There’s More

There are several other published cloud standards, technical reports (TRs), and technical specifications (TSs) that were not addressed in the webinar, including:
  • ISO/IEC 17826:2012, Information technology — Cloud Data Management Interface (CDMI)
  • ISO/IEC 19941:2017, Information technology — Cloud computing — Interoperability and portability
  • ISO/IEC 19944:2017, Information technology — Cloud computing — Cloud services and devices: data flow, data categories and data use
  • ISO/IEC 22624:2020, Information technology — Cloud computing — Taxonomy based data handling for cloud services
  • ISO/IEC TR 22678:2019, Information technology — Cloud computing — Guidance for policy development
  • ISO/IEC TS 23167:2018, Information technology — Cloud computing — Common technologies and techniques
  • ISO/IEC TR 23186:2018, Information technology — Cloud computing — Framework of trust for processing of multi-sourced data
  • ISO/IEC TR 23188:2020, Information technology — Cloud computing — Edge computing landscape
Additionally, there are several other cloud projects in various stages of development, including:
  • ISO/IEC AWI TR 3445, Information technology — Cloud computing — Guidance and best practices for cloud audits
  • ISO/IEC TR 23187, Information technology — Cloud computing — Interacting with cloud service partners (CSNs)
  • ISO/IEC 23613, Information technology — Cloud computing — Cloud service metering elements and billing modes
  • ISO/IEC 23751, Information technology — Cloud computing and distributed platforms — Data sharing agreement (DSA) framework
  • ISO/IEC 23951, Information technology — Cloud computing — Guidance for using the cloud SLA metric model
Cloud standardization continues to be an active area of work for ISO and there are likely to be many more standards to come.


Object Storage Questions: Asked and Answered

John Kim

Mar 20, 2020

Last month, the SNIA Networking Storage Forum (NSF) hosted a live webcast, “Object Storage: What, How and Why.” As the title suggests, our NSF members and invited guest experts delivered foundational knowledge on object storage, explaining how object storage works, use cases, and standards. They even shared a little history on how object storage originated. If you missed the live event, you can watch the on-demand webcast or find it on our SNIAVideo YouTube Channel. We received some great questions from our live audience. As promised, here are the answers to them all.

Q. How can you get insights into object storage on premises? E.g. quota, who consumes what, auditing the data for data leaks, etc. Is there a tool for understanding and managing object data solutions?

A. Yes, on-premises storage systems have quota management including enforcement options (and even that can be a hard or soft enforcement). As for data leaks, object storage access and consumption are always logged, but even on-premises, the security model of the Internet should be used there as well (take security seriously). If you’re unsure about where to start with security, may we suggest our Storage Security Series.

Q. Are object sizes a consideration when deciding whether it makes sense to use object storage?

A. Yes, both the size of the objects and the overall amount of data are important, especially around egress from the public cloud. This is where modeling and testing a solution will provide valuable feedback on the performance and economics of object storage for a use case.

Q. I hear that object storage equals SLOW storage, i.e. for backup or archives, but can object storage have high performance, and if so, what use cases are there for high-performance object storage?

A. Object storage is not necessarily slow; in some cases it can be quite fast. It depends on how the application writes and reads data from object storage and on the media used for the object storage.
A single monolithic file simply put in object storage will have a different behavior characteristic than a more data-specific placement in object storage. Different public cloud service levels (Hot, Cool, and Archive, for example, on Azure) make a difference in performance as well. Lastly, public cloud throttling can come into play.

Q. You mentioned that on-premises-deployed S3-compatible object storage solutions support object locking or enforced retention (like Amazon’s Object Lock feature). Cloudian’s HyperStore supports a fully compliant SEC 17a-4/FINRA Object Lock WORM solution. Does NetApp StorageGRID support Object Lock?

A. Some on-premises solutions can do object lock as well. We recommend that you take a specific look at each vendor’s support specifications for more details.

Q. What is the minimum size of data below which object storage becomes inefficient, and other types of block and file-system storage are more efficient?

A. There is no single number that would answer this question for every application.

Q. How do current customers define a common metadata format when they have a variety of data that is hard to group?

A. This would really be determined at the application level; specifically, what application is reading and writing data from object storage.

Q. Can object storage be enabled with versioning capabilities? Is there any limit on the total number of versions for an object?

A. In some cases (for instance, at least for S3 storage), the version is a copy of an object. Each copy has a version. It will depend on the specific vendor’s solution as to whether there are limits on the versions, and it is best to always check the vendor’s available information for their implementation details.

Q. At a high level, how would you differentiate on-premises-based object storage solutions from public cloud offerings?

A.
The main differences are that with on-premises solutions the customers have full control, whereas the public cloud and service provider offerings are more globally accessible. Additionally, public offerings may have additional features, functionality, and service levels available.

Q. Fundamentally, object storage eventually lives on block storage. For things like erasure coding for geo-distributed access and protection, does the object storage engine handle that replication of data blocks on SSD/HDD storage up at the application layer?

A. Each object storage provider will ensure the availability of the data in their own way at the hardware control plane level. The public cloud providers intentionally abstract the details of the hardware in many cases; the shared responsibility model, however, puts the ultimate control of the data on the tenant.

Q. What are the leading object storage solutions in the Gartner benchmark?

A. As recently as 2019, Gartner issued a Magic Quadrant for Distributed File Systems and Object Storage, which showcases the industry solutions for non-hyperscale object storage implementations. Many of the vendors allow reprints, and we recommend you read the full report for those implementations.
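The erasure-coding question above can be illustrated in miniature with single-parity XOR, the simplest form of redundancy across shards. This is a toy sketch for illustration only; real object stores typically use more sophisticated erasure codes (e.g., Reed-Solomon) that tolerate multiple simultaneous losses:

```python
# Toy single-parity "erasure coding": XOR parity across equal-length
# data shards lets any one lost shard be reconstructed from the rest.
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(shards):
    """Compute the XOR parity shard over all data shards."""
    return reduce(xor_bytes, shards)

def recover(surviving_shards, parity):
    """Rebuild the single missing data shard from survivors + parity."""
    return reduce(xor_bytes, surviving_shards, parity)

data = [b"obj-part-1", b"obj-part-2", b"obj-part-3"]
parity = make_parity(data)
# Pretend the middle shard is lost; rebuild it from the others plus parity.
rebuilt = recover([data[0], data[2]], parity)
print(rebuilt)  # b'obj-part-2'
```

Whether this computation happens in the object storage engine, in a storage array, or on the drives themselves is exactly the placement decision the answer above describes.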


How AI Impacts Storage and IT

Alex McDonald

Mar 13, 2020

Artificial intelligence (AI) and machine learning (ML) have had quite the impact on most industries in the last couple of years, but what about the effect on our own IT industry? On April 1, 2020, the SNIA Cloud Storage Technologies Initiative will host a live webcast, “The Impact of Artificial Intelligence on Storage and IT,” where our experts will explore how AI is changing the nature of applications, the shape of the data center, and its demands on storage. Learn how the rise of ML can develop new insights and capabilities for IT operations. In this webcast, we will explore:
  • What is meant by Artificial Intelligence, Machine Learning and Deep Learning?
  • The AI market opportunity
  • The anatomy of an AI Solution
  • Typical storage requirements of AI and the demands on the supporting infrastructure
  • The growing field of IT operations leveraging AI (aka AIOps)
Yes, we know this is on April 1st, but it’s no joke! So, don’t be fooled and find out why everyone is talking about AI now. Register today.

