Nov 18, 2016
More than 350 people have already seen our SNIA Ethernet Storage Forum (ESF) webcast “Clustered File Systems: No Limits.” Our presenters, James Coomer and Jerry Lotto, did a great job explaining what clustered file systems are, key considerations, choices and performance. As we expected, there were plenty of questions, so as promised, here are answers to them all.
Q: Parallel NFS (pNFS) has been in development/standards efforts for a long time, and I believe pNFS is not in the Linux kernel; it appears pNFS has yet to hit prime time.
A: pNFS has been in Linux for over a decade! Clients and server are widely available, and you should look at the SNIA White Paper “An Updated Overview of NFSv4; NFSv4.0, NFSv4.1, pNFS, and NFSv4.2” for more information on the current state of play.
Q: Why the emphasis on parallel I/O? Any single storage server can feed results at link capacity, so you do not need multiple storage servers to feed a client at full speed. Isn’t the more critical issue the bottleneck on access to metadata for a single directory or file? Doesn’t federated NAS bottleneck updates for each directory behind a single master server?
A: Any one storage server can usually saturate one client, but often there are multiple hungry clients making requests simultaneously. Using parallel I/O allows multiple servers to feed multiple high-bandwidth clients across a narrow or wide set of data. This smooths out the I/O load on the servers in a near-perfect manner regardless of the number of clients performing I/O. It is absolutely true that metadata serving can become a bottleneck, so parallel file systems use cached and/or distributed metadata to overcome this; again, every client takes part in that interaction and shares some responsibility for managing and communicating metadata updates.
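To picture how striping spreads that load, here is a minimal sketch in C of how a striped layout might map file offsets to storage servers. The 1 MiB stripe size and eight servers are made-up parameters for illustration, not any particular file system's defaults or layout algorithm.

```c
#include <stdio.h>

/* Hypothetical striping parameters -- real parallel file systems
 * choose these per file or per storage pool. */
#define STRIPE_SIZE  (1 << 20)   /* 1 MiB stripe unit */
#define NUM_SERVERS  8           /* number of storage servers */

/* Map a byte offset within a file to the server that stores it. */
static int server_for_offset(long long offset)
{
    return (int)((offset / STRIPE_SIZE) % NUM_SERVERS);
}

int main(void)
{
    /* A large sequential transfer touches every server in turn,
     * so no single server becomes the bottleneck. */
    for (long long off = 0; off < 8LL * STRIPE_SIZE; off += STRIPE_SIZE)
        printf("offset %lld MiB -> server %d\n",
               off >> 20, server_for_offset(off));
    return 0;
}
```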
Q: Can any application access parallel file system (i.e. through an agent in the driver level)? Or does it require specific code within the application?
A: Native access to a parallel file system requires a specific client or agent in the host, but many parallel file systems allow any client to access the data through a NAS protocol gateway. No changes are needed to applications to use a parallel file system – these file systems are mounted as POSIX-compliant file systems and therefore adhere to essentially the same standards as, for example, an NFS mount.
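To make the "no application changes" point concrete, here is a minimal sketch using ordinary POSIX calls. The /mnt/parallelfs path is hypothetical; the same code runs unchanged whether that mount point is a local disk, an NFS export, or a parallel file system.

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* /mnt/parallelfs is a hypothetical mount point; the application
     * neither knows nor cares which file system backs it. */
    int fd = open("/mnt/parallelfs/results.dat",
                  O_CREAT | O_WRONLY | O_TRUNC, 0644);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    const char msg[] = "ordinary POSIX write\n";
    if (write(fd, msg, strlen(msg)) < 0)
        perror("write");

    close(fd);
    return 0;
}
```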
Q: Are parallel file system clients compatible with scale-out NAS servers?
A: Nearly all scale-out NAS servers speak a standard NAS protocol such as NFS or SMB, and clients running a parallel file system client can also access NAS via these standard protocols. Exceptions could conceivably arise (though we know of none) for scale-out NAS servers that use a modified NFS/SMB protocol or a custom NAS client, which might conflict with the parallel file system client installed on the same OS.
Q: Of course I am biased, but I am fond of the AFS (Andrew File System) Family of File Systems. There is OpenAFS, but there is also what we are doing at AuriStor extending beyond the core AFS global namespace model (security functionality, and performance)
A: AFS is another distributed file system which supports large scale deployments, native clients for many platforms, and strong security features. It also uses local caching of files to improve performance. It uses a weakly consistent file locking system, so multiple clients can access the same file simultaneously but they cannot both update the same file at the same time. OpenAFS is an open-source implementation of AFS. AuriStor (formerly Your File System, Inc.) is a startup providing a commercial parallel file system that is compatible with AFS.
Q: I am more familiar with Veritas Cluster File System, could you please do a quick compare with Lustre or GPFS?
A: The Veritas Cluster File System (formerly VxCFS, now part of Veritas InfoScale) is a distributed file system that runs on Linux and popular flavors of Unix. It supports up to 64 nodes and allows multiple nodes to share the same back-end storage hardware. Comparing it to Lustre and GPFS is beyond the scope of this webinar, but in basic terms, parallel file systems can offer far greater scalability and bandwidth, for example through the use of optimized RDMA clients for high-performance networks.
Q: Why do file apps need shared access to data, but block apps do not?
A: Traditionally block storage did not offer shared access to data (except when used as shared back-end storage for a clustered file system), while apps that needed shared access to data usually chose to use a NAS protocol such as SMB or NFS. So in many cases file-based apps use file sharing protocols because they need shared access to data from multiple clients. (In other cases file-based applications do not require sharing but the storage administrators believe it’s easier to manage or less expensive than networked block storage.)
Q: Do Lustre and GPFS have SMB Direct support?
A: Not today. SMB Direct is an option to use RDMA and multi-channel with the SMB 3 protocol. Both Lustre and GPFS support the ability to export a file system via NFS or SMB, but generally they do not support SMB Direct yet. Both Lustre and GPFS support RDMA access through their clients.
Q: How do the clients avoid doing simultaneous writes to the same file?
A: Some parallel file systems allow this by letting different clients write to different parts of the same file. Others do not allow this. In either case, distributed file locking is used to prevent two clients from writing simultaneously to the same part of a file (or to the same file if it’s not allowed).
Q: How can you say that the application “does not have to worry about” how the clustered file system serializes writes? Doesn’t this require continuous end-to-end connectivity?
A: When the application writes data it generally writes to a POSIX-compliant file system and does not need to worry about how the parallel file system serializes, distributes, or protects the data because this is virtualized (managed) by the file system. It usually does require continuous end-to-end connectivity from the clients to the servers, though in some cases caching could allow for brief gaps in connectivity and in some systems not every client needs to have network connectivity to every server. There are multiple mechanisms within parallel file systems to manage the various cases of clients/servers disappearing from the network, temporarily or permanently (whilst for example holding a lock).
Q: How does a parallel file system handle sequences of writes to the same file? Does it just append them one by one? What if a client modifies a line?
A: This is both the biggest challenge for a parallel file system and a key reason to use one. Beneath the covers, Spectrum Scale maintains coherency using a token management server process that issues locks for object requests; Lustre implements similar functionality with a distributed lock manager. The locked objects are most commonly blocks within files rather than entire files, but this is application-controlled. The end result is a POSIX-compliant interface that scales to thousands of clients.
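From the application's side, that coordination surfaces as ordinary POSIX byte-range locking. The sketch below is purely illustrative (the file path and offsets are invented): it shows the kind of fcntl() locking two clients might use to update different regions of the same file, with the parallel file system's token or lock manager serializing the requests behind the scenes.

```c
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Lock, write, and unlock one region of a shared file.
 * Each client would call this with its own offset. */
static int write_region(const char *path, off_t offset,
                        const char *buf, size_t len)
{
    int fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return -1; }

    struct flock lk = {
        .l_type   = F_WRLCK,      /* exclusive write lock       */
        .l_whence = SEEK_SET,
        .l_start  = offset,       /* lock only this byte range  */
        .l_len    = (off_t)len,
    };
    if (fcntl(fd, F_SETLKW, &lk) < 0) {   /* blocks until granted */
        perror("lock"); close(fd); return -1;
    }

    pwrite(fd, buf, len, offset);  /* write inside the locked range */

    lk.l_type = F_UNLCK;           /* release the range */
    fcntl(fd, F_SETLK, &lk);
    close(fd);
    return 0;
}

int main(void)
{
    /* Two non-overlapping regions of a hypothetical shared file. */
    write_region("/mnt/parallelfs/shared.dat", 0,       "client A", 8);
    write_region("/mnt/parallelfs/shared.dat", 1 << 20, "client B", 8);
    return 0;
}
```

Because the two regions do not overlap, both writes can proceed concurrently; if they did overlap, the second locker would simply wait.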
Q: What does FPO stand for?
A: File Placement Optimizer – a shared-nothing architecture and licensing model for IBM Spectrum Scale (aka GPFS). Learn more here.
Q: Is there a concept in parallel file systems for “auto-tuning” yet? Seems like the early days of SAN management and tuning…
A: Default tuning values are optimized for general-purpose workloads, but the whole purpose of tuning parameters is to adjust away from those defaults to optimize the file system for a particular application workload or file system architecture. Both IBM and OpenSFS, with the support of Intel, have published extensive documentation on best practices for optimizing and tuning their respective file systems. We are not aware of any work on automating that process, but there has been recent work (e.g., in Spectrum Scale) to simplify the tuning process.
Q: Which is better as an interconnect between disks and servers, shared access or shared-nothing?
A: The use of shared access in the interconnect between disks and servers is limited to providing HA functionality in Lustre or Spectrum Scale – the ability to service I/O requests to a storage device if the server with primary responsibility for that device is not available. This usually involves multiple servers attached to external storage, which can add cost to building the file system. The alternative approach to HA is to replicate blocks of data to different disks on different servers, which cuts back on the usable capacity of the file system. If HA is not a requirement, a shared-nothing architecture will generally involve less hardware and therefore be less expensive to build.
If you have more questions, please comment on this blog. And I encourage you to check out the SNIA ESF webcast library for educational, vendor-neutral content on Ethernet networked storage topics.
May 4, 2016
The next Ethernet Storage Forum Webcast, “Evolution of iSCSI including iSER, iSCSI over RDMA Ethernet,” will focus on developments with iSCSI – the Internet Protocol standard for transferring SCSI commands across an Ethernet network, enabling hosts to link to storage devices wherever they may be. At this Webcast on May 24th, I will be joined by Fred Knight, Standards Technologist at NetApp, and Andy Banta, Storage Janitor at SolidFire/NetApp, who will discuss the evolution of iSCSI up to iSER, which takes advantage of Ethernet RDMA fabric technologies to enhance performance. Register now to hear:
The Webcast will be live, so please bring your questions for Andy and Fred. We hope to see you there!
Mar 21, 2016
Our recent SNIA Ethernet Storage Forum Webcast on How Ethernet RDMA Protocols iWARP and RoCE Support NVMe over Fabrics generated a lot of great questions. We didn’t have time to get to all of them during the live event, so as promised here are the answers. If you have additional questions, please comment on this blog and we’ll get back to you as soon as we can.
Q. Are there still actual (memory based) submission and completion queues, or are they just facades in front of the capsule transport?
A. On the host side, they’re “facades” as you call them. When running NVMe/F, host reads and writes do not actually use NVMe submission and completion queues. That data just comes from and to RNIC RDMA queues. On the target side, there could be real NVMe submissions and completion queues in play. But the more accurate answer is that it is “implementation dependent.”
Q. Who places the command from NVMe queue to host RDMA queue from software standpoint?
A. This is managed by the kernel host software in code written to the NVMe/F specification. The idea is that any existing application that thinks it is writing to the existing NVMe host software will in fact cause the SQE entry to be encapsulated and placed in an RDMA send queue.
Q. You say “most enterprise switches” support NVMe/F over RDMA, I guess those are ‘new’ ones, so what is the exact question to ask a vendor about support in an older switch?
A. For iWARP, any switch that can handle Internet traffic will do. Mellanox and Intel have different answers for RoCE / RoCEv2. Mellanox says that for RoCE, it is recommended, but not required, that the switch support Priority Flow Control (PFC). Most new enterprise switches support PFC, but you should check with your switch vendor to be sure. Intel believes RoCE was architected around DCB. The name itself, RoCE, stands for “RDMA over Converged Ethernet,” i.e., Ethernet with DCB. Intel believes RoCE in general will require PFC (or some future standard that delivers equivalent capabilities) for efficient RDMA over Ethernet.
Q. Can you comment on when one should use RoCEv2 vs. iWARP?
A. We gave a high-level overview of some of the deployment considerations on slide 30. We refer you to some of the vendor links on slide 32 for “non-vendor neutral” perspectives.
Q. If you take RDMA out of equation, what is the key advantage of NVMe/F over other protocols? Is it that they are transparent to any application?
A. NVMe/F allows the application to bypass the SCSI stack and uses native NVMe commands across a network. Most other block storage protocols require using the SCSI protocol layer, translating the NVMe commands into SCSI commands. With NVMe/F you also gain parallelism, simplicity of the command set, a separation between administrative sessions and data sessions, and a reduction of latency and processing required for NVMe I/O operations.
Q. Is ROCE v1 compatible with ROCE v2?
A. Yes. Adapters speaking RoCEv2 can also maintain RDMA connections with adapters speaking RoCEv1 because RoCEv2 ports are backwards interoperable with RoCEv1. Most of the currently shipping NICs supporting RoCE support both RoCEv1 and RoCEv2.
Q. Are RoCE and iWARP the only way to use Ethernet as a fabric for NMVe/F?
A. Initially yes; only iWARP and RoCE are supported for NVMe over Ethernet. But the NVM Express Working Group is also targeting FCoE. We should have probably been clearer about that, though it is noted on slide 11.
Q. What about doing NVMe over Fibre Channel? Is anyone looking at, or doing this?
A. Yes. This is not in scope for the first spec release, but the NVMe WG is collaborating with the FCIA on this. So NVMe over Fibre Channel is expected as another standard in the near future, to be promoted by T11.
Q. Do RoCE and iWARP both use just IP addresses for management or is there a higher level addressing mechanism, and management?
A. RoCEv2 uses the RoCE Connection Manager, and iWARP uses TCP connection management. They both use IP for addressing.
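As a rough illustration of that IP-based addressing, here is a minimal client-side sketch using the librdmacm API from rdma-core. The 192.0.2.10 address and the port number are placeholders; the point is that connection setup starts from an ordinary IP address and port, and the connection manager resolves it to an RDMA device and path whether the underlying NIC speaks iWARP or RoCE.

```c
/* build: cc cm_resolve.c -lrdmacm */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <rdma/rdma_cma.h>

int main(void)
{
    struct rdma_event_channel *ec = rdma_create_event_channel();
    struct rdma_cm_id *id = NULL;

    if (!ec || rdma_create_id(ec, &id, NULL, RDMA_PS_TCP)) {
        perror("rdma_create_id");
        return 1;
    }

    /* Placeholder target: an ordinary IPv4 address and port. */
    struct sockaddr_in dst = { .sin_family = AF_INET,
                               .sin_port   = htons(12345) };
    inet_pton(AF_INET, "192.0.2.10", &dst.sin_addr);

    if (rdma_resolve_addr(id, NULL, (struct sockaddr *)&dst, 2000)) {
        perror("rdma_resolve_addr");
        return 1;
    }

    /* Wait for the address-resolution event; a real client would then
     * call rdma_resolve_route(), create a QP, and rdma_connect(). */
    struct rdma_cm_event *ev;
    if (rdma_get_cm_event(ec, &ev) == 0) {
        printf("event: %s\n", rdma_event_str(ev->event));
        rdma_ack_cm_event(ev);
    }

    rdma_destroy_id(id);
    rdma_destroy_event_channel(ec);
    return 0;
}
```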
Q. Are there other fabrics to run NVMe over fabrics? Can you do this over OmniPath or Infiniband?
A. InfiniBand is in scope for the first spec release. Also, there is a related effort by the FCIA to support NVMe over Fibre Channel in a standard that will be promoted by T11.
Q. You indicated the NVMe stack is in the kernel while RDMA is a user-level verb. How are NVMe SQ/CQ entries transferred from NVMe to RDMA and vice versa? Also, could smaller transfers in NVMe (e.g., an SGL of 512B) be combined into larger sizes before being sent to RDMA entries, and vice versa?
A. NVMe/F supports multiple scatter-gather entries to combine multiple non-contiguous transfers; nevertheless, the protocol doesn’t support chaining multiple NVMe commands in the same command capsule. A command capsule contains only a single NVMe command. Please also refer to slide 18 from the presentation.
Q. 1) How do implementers and adopters today test NVMe deployments? 2) Besides latency, what other key performance indicators do implementers and adopters look for to determine whether the NVMe deployment is performing well or not?
A. 1) Like any other datacenter specification, testing is done by debugging, interop testing and plugfests. Local NVMe is well supported and can be tested by anyone. NVMe/F can be tested using pre-standard drivers or solutions from various vendors. UNH-IOL is an organization with an excellent reputation for helping here. 2) Latency, yes. But also sustained bandwidth, IOPS, and CPU utilization, i.e., the “usual suspects.”
Q. If RoCE CM supports ECN, why can’t it be used to implement a full solution without requiring PFC?
A. Explicit Congestion Notification (ECN) is an extension to TCP/IP defined by the IETF. First point is that it is a standard for congestion notification, not congestion management. Second point is that it operates at L3/L4. It does nothing to help make the L2 subnet “lossless.” Intel and Mellanox agree that generally speaking, all RDMA protocols perform better in a “lossless,” engineered fabric utilizing PFC (or some future standard that delivers equivalent capabilities). Mellanox believes PFC is recommended but not strictly required for RoCE, so RoCE can be deployed with PFC, ECN, or both. In contrast, Intel believes that for RoCE / RoCEv2 to deliver the “lossless” performance users expect from an RDMA fabric, PFC is in general required.
Q. How involved are Ethernet RDMA efforts with the SDN/OCP community? Is there a coming example of RoCE or iWARP on an SDN switch?
A. Good question, but neither RoCEv2 nor iWARP look any different to switch hardware than any other Ethernet packets. So they’d both work with any SDN switch. On the other hand, it should be possible to use SDN to provide special treatment with respect to say congestion management for RDMA packets. Regarding the Open Compute Project (OCP), there are various Ethernet NICs and switches available in OCP form factors.
Q. Is there a RoCE v3?
A. No. There is no RoCEv3.
Q. iWARP and RoCE both fall back to TCP/IP in the lowest communication sense? So they are somewhat compatible?
A. They can speak sockets to each other. In that sense they are compatible. However, for the usage model we’re considering here, NVMe/F, RDMA is required. Because of L3/L4 differences, RoCE and iWARP RNICs cannot speak RDMA to each other.
Q. So in case of RDMA (ROCE or iWARP), the NVMe controller’s fabric port is Ethernet?
A. Correct. But it must be RDMA-enabled Ethernet.
Q. What if I am using soft RoCE, do I still need an RNIC?
A. Functionally, soft RoCE or soft iWARP should work on a regular NIC. Whether the performance is sufficient to keep up with NVMe SSDs without the hardware offloads is a different matter.
Q. How would the NVMe controller know that a command is placed in the submission queue by the Fabric host driver? Is the fabric host driver responsible for notifying the NVMe controller through remote doorbell trigger or the Fabric target driver should trigger the doorbell?
A. No separate notification by the host required. The fabric’s host driver simply sends a command capsule to notify its companion subsystem driver that there is a new command to be processed. The way that the subsystem side notifies the backend NVMe drive is out of the scope of the protocol.
Q. I am chair of the ETSI NFV working group on NFV acceleration. We are working on virtual RDMA and how a VM can benefit from hardware-independent RDMA. One cornerstone of this is a virtual-RDMA pseudo device, but there is not yet consensus on the minimal set of verbs to be supported: do you think this minimal verb set can be identified? Lastly, the transport address space is not consistent between IB and Ethernet. How can transport-independent RDMA be supported?
A. You know, the NVM Express Working Group is working on exactly these questions. They have to define a “minimal verb set” since NVMe/F generates the verbs. Similarly, I’d suggest looking to the spec to see how they resolve the transport address space differences.
Q. What’s the plan for Linux submission of NVMe over Fabric changes? What releases are being targeted?
A. The Linux Driver WG in the NVMe WG expects to submit code upstream within a quarter of the spec being finalized. At this time it looks like the most likely Linux target will be kernel 4.6, but it could end up being kernel 4.7.
Q. Are NVMe SQ/CQ transferred transparently to RDMA Queues or can they be modified?
A. The method defined in the NVMe/F specification entails a transparent transfer. If you wanted to modify an SQE or CQE, do so before initiating an NVMe/F operation.
Q. How common are rNICs for recent servers? i.e. What’s a quick check I can perform to find out if my NIC is an rNIC?
A. rNICs are offered by nearly all major server vendors. The best way to check is to ask your server or NIC vendor if your NIC supports iWARP or RoCE.
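If you prefer a programmatic check on Linux, the short libibverbs (rdma-core) sketch below lists whatever RDMA-capable devices the kernel has registered. Note that it reports any RDMA device, including InfiniBand HCAs, so an empty list is a strong hint that you have no usable rNIC or that drivers are missing, but asking your vendor remains the definitive answer.

```c
/* build: cc list_rnics.c -libverbs */
#include <stdio.h>
#include <infiniband/verbs.h>

int main(void)
{
    int num = 0;
    struct ibv_device **devs = ibv_get_device_list(&num);

    if (!devs || num == 0) {
        printf("no RDMA-capable devices found "
               "(no rNIC, or drivers not loaded)\n");
    } else {
        for (int i = 0; i < num; i++)
            printf("RDMA device: %s\n", ibv_get_device_name(devs[i]));
    }

    if (devs)
        ibv_free_device_list(devs);
    return 0;
}
```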
Q. This is most likely out of the scope of this talk, but could you perhaps share a 30,000-foot view of the differences between “NVMe controller” hardware versus “NVMe/F” hardware? It’s most likely a combination of R-NIC plus NVMe controller, but it would be great to get your take on this.
A. A goal of the NVMe/F spec is that it work with all existing NVMe controllers and all existing RoCE and iWARP RNICs, so even at a very low level we can say “no difference.” That said, of course, nothing stops someone from combining NVMe controller and rNIC hardware into one solution.
Q. Are there any example Linux targets in the distros that exercise RDMA verbs? An iWARP or iSER target in a distro?
A. iSER allows this using a LIO or TGT SCSI target.
Q. Is there a standard or IP for RDMA NIC?
A. The various RNICs are based on IBTA, IETF, and IEEE standards, which are shown on slide 26.
Q. What is the typical additional latency introduced comparing NVMe over Fabric vs. local NVMe?
A. In the 2014 IDF demo, the prototype NVMe/F stack matched the bandwidth of local NVMe with a latency penalty of only 8µs over a local iWARP connection. Other demonstrations have shown an added fabric latency of 3µs to 15µs. The goal for the final spec is under 10µs.
Q. How well is NVMe over RDMA supported for Windows?
A. It is not currently supported, but then the spec isn’t even finished. Contact Microsoft if you are interested in their plans.
Q. RDMA over Ethernet would not support Layer 2 switching? How do you deal with TCP overhead?
A. L2 switching is supported by both iWARP and RoCE. Both flavors of RNICs have MAC addresses, etc. iWARP had to deal with TCP/IP in hardware, a TCP/IP Offload Engine or TOE. The TOE used in an iWARP RNIC is significantly constrained compared to a general purpose TOE and therefore can operate with very high performance. See the Chelsio website for proof points. RoCE does not use TCP so does not need to deal with TCP overhead.
Q. Does RDMA not work with fibre channel?
A. They are totally different Transports (L4) and Networks (L3). That said, the FCIA is working with NVMe, Inc. on supporting NVMe over Fibre Channel in a standard to be promoted by T11.
Jan 6, 2016
NVMe (Non-Volatile Memory Express) over Fabrics is of tremendous interest among storage vendors, flash manufacturers, and cloud and Web 2.0 customers. Because it offers efficient remote and shared access to a new generation of flash and other non-volatile memory storage, it requires fast, low latency networks, and the first version of the specification is expected to take advantage of RDMA (Remote Direct Memory Access) support in the transport protocol.
Many customers and vendors are now familiar with the advantages and concepts of NVMe over Fabrics but are not familiar with the specific protocols that support it. Join us on January 26th for this live Webcast that will explore and compare the Ethernet RDMA protocols and transports that support NVMe over Fabrics and the infrastructure needed to use them. You’ll hear:
The event is live, so please bring your questions. We look forward to answering them.
Jul 17, 2015
Almost 200 people attended our joint Webcast with the Ethernet Alliance: “The 2015 Ethernet Roadmap for Networked Storage.” We had a lot of great questions during the live event, but we did not have time to answer them all. As promised, we’ve compiled answers for all of the questions that came in. If you think of additional questions, please feel free to comment on this blog.
Q. What did you mean by parity of flash with HDD?
A. We were referring to the O’Reilly article in “Network Computing.” O’Reilly is predicting parity in BOTH capacity and price in 2016.
Q. When do we expect IEEE standards ratification for 25G speed?
A. 2016. You can see the exact schedule here.
Q. Do you envision the Enterprise, Cloud Providers, HPC, Financials getting rid of their 10/40GbE infrastructure and replacing that with 25/100GbE infrastructure in 2017? Will these customers deploy 100GbE/25GbE switch in the leaf layer in 2017?
A. Deployment will occur over a multi-year time span, if only because switch infrastructure is expensive to upgrade, as reflected in the Crehan Research forecast. New deployments will likely move to 25/100GbE as new switches with 100GbE downstream ports become available in 2016. Because the Cloud Service Providers are currently the most aggressive in driving new infrastructure purchases, they will represent the largest early volumes for 25/100GbE. Enterprise is still in the midst of the transition from 1GbE to 10GbE.
Q. What are some of the developments on spanning-tree derivatives vs. Dijkstra-based derivatives such as OSPF and FSPF for switches?
A. Beyond the scope of this presentation on Ethernet. Ethernet is defined by the IEEE for L1 and L2 in the ISO model. Your questions are at L3 and L4, which is handled by organizations like IETF.
Q. With all the speeds possible who is working on flow control?
A. Flow control at the 802.1 level is supported in the Layer 1/2 PHY & MAC by setting upper bounds on the delay through each layer, which allows higher layers to comprehend the delays & response times to pause frames. Each new speed & PHY in 802.3 is accompanied by delay constraint specifications to support this.
Q. Do you have an overlay graphic that shows the Ethernet RDMA roadmap? If so, is Ethernet storage the primary driver for that technology?
A. Beyond the scope of this presentation on Ethernet. Ethernet is defined by the IEEE for L1 and L2 in the ISO model. Your questions are at L3 and L4, which is handled by organizations like IETF and the InfiniBand Trade Association.
Q. The adoption of faster and new Ethernet always has to do with the costs of acquiring new technology. How long do you think it will take to adopt/acquire faster Ethernet in datacenters now that the development is happening much faster than the last 20 years?
A. Please see the chart on slide 7 where Crehan Research predicts how fast the technology will diffuse into deployments.
Q. What do you expect as a cost comparison between Ethernet and InfiniBand going forward? Also, what work is being done to reduce latency?
A. Beyond the scope of this presentation. Latency is primarily a consequence of design methodologies and semiconductor process technology, and thus under the control of the silicon device manufacturers. Some vendors prioritize latency more than others.
Q. What’s the technical limitation as speeds go higher and higher?
A. A number of factors limit speeds going faster and faster, but the main problem is that materials attenuate signals as they travel at higher frequencies.
Q. Will 1GbE used for manageability purposes disappear from public cloud? If so, what is the expected time frame?
A. This is a choice for end users. Most equipment is managed on a separate network for security concerns, but users can eliminate these management networks at any time.
Q. What are the relative market size predictions for the expanding number of standards (25G, 50G, 100G, 200G, etc.)?
A. See the Crehan Research forecast in the presentation.
Q. What is the major difference between SMF & MMF for the not so initiated?
A. SMF has a 9µm core while MMF has a 50µm core. Different lasers are used for each fiber type; MMF typically reaches 100 meters at 10GbE and above, while SMF reaches from 500m to 10km.
Q. Will 25G be available through both copper and fibre connectivity?
A. Yes. IEEE 802.3 work is currently underway to specify 25Gb/s on twinax (“direct attach copper”) to 5 meters, printed circuit backplane up to ~1m, twisted-pair copper to 30m, and multimode fiber to 100m. There is no technology barrier to 25G on SMF; it’s just that a standards project to specify it has not started yet.
Q. This is interesting from a hardware viewpoint, but has nothing to do with storage yet. Are we going to get to how this relates to storage other than saying flash drives are fast and only Ethernet can keep up?
A. Beyond the scope of this presentation on Ethernet. Ethernet is defined by the IEEE for L1 and L2 in the ISO model. Your questions are directed at the higher layers. The key point of this webcast is that storage networking engineers need to pay much more attention to the Ethernet roadmap than they have historically, primarily because of NVM.
Q. How does “SFP28” fit in this mix? Is it required for 25G?
A. SFP28 connectors and modules are required for 25GbE because they give better performance than SFP+, which only works up to 10GbE.
Q. Can you provide the quick difference between copper & optical on speed & costs?
A. Copper and optical Ethernet links are usually standardized at the same speed. 400GbE is not defining a copper link but an active Direct Attached Cable (DAC) will probably support 400GbE. Cost depends on volume and many factors and is beyond the scope of this presentation. Copper is usually a fraction of the cost of optical links.
Q. Do you think people will try to use multiple CAT 5e to get more aggregate bandwidth to the access points to avoid having to run Fibre to them?
A. IEEE is defining 2.5GBASE-T and 5GBASE-T to enable Cat5e to support faster wireless access points.
Q. When are higher speeds and PoE going to reach the point when copper based Ethernet will become a viable heat source for buildings thus helping the environment?
A. IEEE is defining 4-pair PoE to deliver at least 60W to end devices. You can find out more here.
Q. What are the use cases for 2.5Gb and 5.0Gb Base-T?
A. The leading use case for 2.5G/5GBASE-T is to provide the uplink for wireless LAN access points that support 802.11ac and future wireless technology. Wireless LAN technology has advanced to the point where >1Gb/s BW is needed upstream from the AP, and 2.5G/5G provide a higher speed uplink while preserving the user’s investment in Cat5e/Cat6 cabling.
Q. Why not have only CFP2 sockets right away with things disabled for lower speeds for all the intervening years leading to full-fledged CFP2?
A. CFP2 is defined for 100GbE, and 8 ports can be used on a 1U switch. 100GbE switches are shifting to QSFP28 so that 32 ports of 100GbE are supported in a 1U switch at low cost. The CFP2 is much more expensive than QSFP28 and will not be used for lower speeds because of the high cost.