Revving Up Storage for Automotive

Tom Friend

Nov 8, 2021

Each year cars become smarter and more automated. In fact, the automotive industry is effectively transforming the vehicle into a data center on wheels. Connectedness, autonomous driving, and media & entertainment all bring more and more storage onboard and into networked data centers. But all the storage in (and for) a car is not created equal. There are tens, if not hundreds, of different processors in a car today. Some are attached to storage and some are not, and each application demands different characteristics from its storage device.

The SNIA Networking Storage Forum (NSF) is exploring this fascinating topic on December 7, 2021 at our live webcast “Revving Up Storage for Automotive,” where industry experts from both the storage and automotive worlds will discuss:
  • What’s driving growth in automotive storage?
  • Special requirements for autonomous vehicles
  • Where automotive data is typically stored
  • Special use cases
  • Vehicle networking & compute changes and challenges
Start your engines and register today to join us as we drive into the future!

Keeping Pace with Object Storage Trends & Use Cases

Christine McMonigal

Nov 3, 2021

Object storage has been among the most popular topics we’ve covered in the SNIA Networking Storage Forum. On November 16, 2021, we will take this topic on again at our live webcast “Object Storage: Trends, Use Cases.” Moving beyond the mechanics of object storage, our expert panel will focus on recent object storage trends, problems object storage can solve, and real-world use cases, including ransomware protection.

So, what’s new? Object storage has traditionally been seen as an archival storage platform, but it is now being employed as a platform for primary data. In this webcast, we’ll highlight how this is happening and discuss:

  • Object storage characteristics
  • The differences and similarities between object and key value storage
  • Security options unique to object storage including ransomware mitigation
  • Why use object storage: Use cases and applications
  • Object storage and containers: Why Kubernetes’ COSI (Container Object Storage Interface)?
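
For readers new to the object model, here is a minimal sketch (Python with boto3; the bucket, key, metadata, and credentials are hypothetical assumptions) of two characteristics the panel will build on: objects live under flat keys rather than in a real directory tree, and user metadata travels with each object:

```python
import boto3  # assumes an S3-compatible endpoint and configured credentials

s3 = boto3.client("s3")

# Store an object: the key looks like a path but is just a flat name,
# and the user metadata is persisted alongside the data.
s3.put_object(
    Bucket="demo-bucket",                      # hypothetical bucket
    Key="backups/2021/11/db-snapshot.bak",     # flat key, not a directory
    Body=b"...snapshot bytes...",
    Metadata={"retention": "90d", "source": "db01"},
)

# Retrieve it: the user metadata comes back with the object.
obj = s3.get_object(Bucket="demo-bucket", Key="backups/2021/11/db-snapshot.bak")
print(obj["Metadata"])  # {'retention': '90d', 'source': 'db01'}
```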

Register today. Our SNIA panel will be available to answer your object storage questions.

I also encourage you to check out the previous presentations we’ve done on object storage, available in the SNIA Educational Library.

Fibre Channel SAN Hosts and Targets Q&A

John Kim

Oct 25, 2021

At our recent SNIA Networking Storage Forum (NSF) webcast “How Fibre Channel Hosts and Targets Really Communicate” our Fibre Channel (FC) experts explained exactly how Fibre Channel works, starting with the basics on the FC networking stack, link initialization, port types, and flow control, and then dove into the details on host/target logins and host/target IO. It was a great tutorial on Fibre Channel. If you missed it, you can view it on-demand. The audience asked several questions during the live event. Here are answers to them all:

Q. What is the most common problem that we face in the FC protocol?

A. Much the same as with any other network protocol, congestion is the most common problem found in FC SANs. It can take several forms, including but not limited to host oversubscription and the “fan-in/fan-out” ratios of host ports to storage ports, but it is probably the single largest generator of support cases. Another common problem is the “host cannot see target” class of issue.

Q. What are typical latencies for N-to-N (node-to-node) Port and N-F-N (one switch between)?

A. Latencies vary from switch type to switch type and also vary based on the type of forwarding that is done. Port to port on a single switch is generally in the range of 1µs to 5µs.

Q. Has Fabric Shortest Path First (FSPF) always been there, or is there a minimum FC speed at which it was introduced? Also, how is the FSPF path determined? Is it via shortest path only, or does it also take into account the speeds of the switches along the path?

A. While Fibre Channel has existed since 1993, starting at 133 Mbit/s speeds, FSPF was developed by the INCITS T11 Technical Committee and was published in 2000 as a cost-based link state routing protocol. Costs are based on link speeds: the higher the link speed, the lower the cost. The formula is cost = 10^12 / bandwidth (in bps). There have been implementations that allowed the network administrator to artificially set a link cost and force traffic onto a path, but the better practice is to simply allow FSPF to do its normal work. And yes, link costs are considered for all of the intermediate devices along the path.
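
As a quick illustration of that cost formula, here is a small sketch in Python (the nominal link rates below are simplifying assumptions; real implementations use the rates and costs defined by the T11 standards):

```python
# FSPF link cost = 10^12 / bandwidth (bps): faster links get lower costs,
# so FSPF prefers them when computing shortest paths.
FSPF_NUMERATOR = 10**12

nominal_gbps = {"8GFC": 8, "16GFC": 16, "32GFC": 32, "64GFC": 64}

for name, gbps in nominal_gbps.items():
    cost = FSPF_NUMERATOR // (gbps * 10**9)
    print(f"{name}: link cost = {cost}")
# 8GFC: 125, 16GFC: 62, 32GFC: 31, 64GFC: 15
```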

Q. All of this FSPF happens without even us noticing, right? Or do we need to manually configure?

A. Yes, all of the FSPF routing happens without any manual configuration. Most users don't even realize there is an underlying routing protocol. 

Q. Is it a best practice to have all ports in the system run at the same speed? We have storage connected at 32Gb interfaces and a hundred clients with 16Gb interfaces. Would this make the switch's job easier?

A. It's virtually impossible to have all ports of an FC SAN (or any network of size) connect at the same speed. In fact, the more common environment is for multiple generations of server and storage technology to have been “organically grown over time” in the data center. Even if uniform speeds were somehow achieved, there could still be congestion caused by hosts and targets requesting data from multiple simultaneous sources. So, having a uniform speed doesn't really fix anything, even if it might make some things a bit better. That said, it is always helpful to make certain that your HBA device drivers and firmware versions are up to date.
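
To make the oversubscription concern concrete, here is a hypothetical back-of-the-envelope sketch in Python (the port counts are assumptions for illustration only):

```python
# 100 host ports at 16Gb/s sharing 8 storage ports at 32Gb/s.
host_ports, host_gbps = 100, 16
storage_ports, storage_gbps = 8, 32

fan_in = host_ports / storage_ports
oversubscription = (host_ports * host_gbps) / (storage_ports * storage_gbps)

print(f"Fan-in: {fan_in:.1f} hosts per storage port")         # 12.5
print(f"Bandwidth oversubscription: {oversubscription:.2f}x")  # 6.25x
```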

Q. From your experience, is there any place where the IO has gone wrong?

A. Not sure what “IO gone wrong” means. All frames that traverse the SAN are cyclic redundancy check (CRC) verified. That might happen on each hop, or it might just happen at the end devices. Either way, frames that are found to be corrupted should never be incorporated into the LUN.

Q. Is there a fabric notification feature for these backpressure events?

A. Yes, the recent standards define several mechanisms for notification, collectively called Fabric Performance Impact Notifications (FPIN). These include ELS (extended link service) notifications sent through software to identify congestion, link integrity, and SCSI command delivery issues. In Gen 7/64Gb platforms they also include an in-band hardware signal for credit stall and oversubscription conditions. Today both RHEL and AIX support the receipt of FPIN link integrity notifications and integrate them into their respective MPIO interfaces, allowing them to load balance around, or avoid, a “sick but not dead” link. Additional operating systems are on the way, and the first array vendors to support this are expected “soonish.” While there is no “silver bullet” that solves every congestion problem, FPIN is a huge potential benefit as a tool that engages the whole ecosystem instead of leaving the “switch in the middle” to interpret data on its own.

Q. There is so much good information here. Are the slides available?

A. Yes, the session has been recorded and is available on-demand, along with the slides, in the SNIA Educational Library, where you can also search a wealth of educational content on storage.

Storage for AI Applications Q&A

Tom Friend

Oct 20, 2021

What types of storage are needed for different aspects of AI? That was one of the many topics covered in our SNIA Networking Storage Forum (NSF) webcast “Storage for AI Applications.” It was a fascinating discussion and I encourage you to check it out on-demand. Our panel of experts answered many questions during the live roundtable Q&A. Here are answers to those questions, as well as the ones we didn’t have time to address.

Q. What are the different data set sizes and workloads in AI/ML in terms of data set size, sequential/random access, and write/read mix?

A. Data sets vary incredibly from use case to use case. They may range from GBs to possibly hundreds of PBs. In general, the workloads are very heavily read-dominated, often 95%+ reads. While sequential reads would be preferable, in practice the patterns tend to be closer to random. In addition, different use cases will have very different item sizes: some may be GBs large, while others may be under 1 KB. These different sizes have a direct impact on storage performance and may change how you decide to store the data.

Q. Can you give more details on the risks associated with the use of online databases?

A. The biggest risk with using an online DB is that you will be adding an additional workload to an important central system. In particular, you may find that the load is not as predictable as you think and it may impact the database performance of the transactional system. In some cases, this is not a problem, but when it is intended for actual transactions, you could be hurting your business.

Q. What is the difference between a DPU and a RAID / storage controller?

A. A Data Processing Unit (DPU) is intended to process the actual data passing through it. A RAID/storage controller is only intended to handle functions such as data resiliency around the data, but not the data itself. A RAID controller might take a CSV file and break it down for storage across different drives; however, it does not actually analyze the data. A DPU might take that same CSV and look at the different rows and columns to analyze the data. While the distinction may seem small, there is a big difference in the software: a RAID controller does not need to know anything about the data, whereas a DPU must be programmed to deal with it. Another important aspect is whether or not the data will be encrypted. If the data will be encrypted, a DPU will have to have additional security mechanisms to deal with decryption of the data, whereas a RAID-based system will not be affected.
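
As a toy illustration of that distinction, the sketch below (Python; the CSV payload is hypothetical) first handles the bytes content-agnostically, the way a RAID controller stripes data, and then content-aware, the way a DPU might parse it:

```python
import csv
import io

data = b"sensor,reading\ncam1,42\ncam2,17\n"  # hypothetical payload

# RAID-controller-style: content-agnostic. Stripe raw bytes across two
# "drives"; the controller never interprets what the bytes mean.
stripe = 8
stripes = [data[i:i + stripe] for i in range(0, len(data), stripe)]
drives = [stripes[0::2], stripes[1::2]]  # round-robin placement

# DPU-style: content-aware. Parse the same bytes as CSV rows and columns.
rows = list(csv.DictReader(io.StringIO(data.decode())))
total = sum(int(row["reading"]) for row in rows)

print(f"stripes per drive: {[len(d) for d in drives]}")  # [2, 2]
print(f"sum of readings: {total}")                       # 59
```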

Q. Is a CPU-bypass device the same as a SmartNIC?

A. Not entirely. They are often discussed together, but a DPU is intended to process the data itself, whereas a SmartNIC may only process how the data is handled (encryption, TCP/IP offload functions, etc.). It is possible for a SmartNIC to also act as a DPU, where the data itself is processed. There are new NVMe-oF™ technologies that are beginning to allow FPGA, TPU, DPU, GPU and other devices direct access to other servers’ storage over a high-speed local area network without having to involve the CPU of that system.

Q. What work is being done to accelerate S3 performance with regard to AI?

A. A number of companies are working to accelerate the S3 protocol, and Presto and a number of Big Data technologies use it natively. For AI workloads, there are also a number of caching technologies that handle the re-reads of training data on a local system, minimizing the performance penalty.
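
As one sketch of that caching idea, here is a minimal read-through cache in Python using boto3 (the bucket, key, and cache directory are hypothetical; a production cache would also need eviction, locking, and integrity checks):

```python
import os

import boto3  # assumes configured AWS credentials

s3 = boto3.client("s3")
CACHE_DIR = "/tmp/s3cache"  # hypothetical local cache location

def cached_get(bucket: str, key: str) -> bytes:
    """Fetch an object from S3 once; serve re-reads from local disk."""
    path = os.path.join(CACHE_DIR, bucket, key.replace("/", "_"))
    if not os.path.exists(path):  # first read goes to S3
        os.makedirs(os.path.dirname(path), exist_ok=True)
        body = s3.get_object(Bucket=bucket, Key=key)["Body"].read()
        with open(path, "wb") as f:
            f.write(body)
    with open(path, "rb") as f:   # training re-reads hit the cache
        return f.read()

shard = cached_get("training-data", "epoch1/shard-0001.bin")
```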

Q. From a storage perspective, how do I take different types of data from different storage systems to develop a model?

A. Work with your project team to find the data you need and ensure it can be served to the ML/DL training (or inference) environment in a timely manner. You may need to copy (or clone) data onto a faster medium to achieve your goals. But look at the process as a whole: do not underestimate the data cleansing/normalization steps in your storage analysis, as they can prove to be a bottleneck.

Q. Do I have to "normalize" that data to the same type, or can a model accommodate different data types?

A. In general, yes. Models can be very sensitive. A model trained on one set of data with one set of normalizations may not be accurate if data that was taken from a different set with different normalizations is used for inference. This does depend on the model, but you should be aware not only of the model, but also the details of how the data was prepared prior to training.

Q. If I have to change the data type, do I then need to store it separately?

A. It depends on your data; the question to ask is, “do other systems need it in the old format?”

Q. Are storage solutions that are right for one form of AI also the best for others?

A. No. While it may be possible to use a single solution for multiple AI workloads, in general there are differences in the data that can necessitate different storage. A relatively simple example is large data (MBs) vs. small data (~1 KB). Large, multi-MB data can easily be erasure coded and stored more cost-effectively. For small data, however, erasure coding is not practical, and you will generally have to go with replication.
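
A quick back-of-the-envelope comparison shows why (the 8+2 erasure code and 3-way replication below are illustrative assumptions, not recommendations):

```python
# Raw capacity consumed per unit of usable data.
data_shards, parity_shards = 8, 2
ec_overhead = (data_shards + parity_shards) / data_shards  # 1.25x
replication_overhead = 3.0                                 # three full copies

object_mb = 64  # a "large data" object
print(f"EC raw usage:          {object_mb * ec_overhead:.0f} MB")           # 80 MB
print(f"Replication raw usage: {object_mb * replication_overhead:.0f} MB")  # 192 MB
# For ~1 KB objects, per-shard and metadata overheads swamp the EC
# savings, which is why small data is usually replicated instead.
```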

Q. How do features like CPU bypass impact performance of storage?

A. CPU bypass is essential for those times when all you need to do is transfer data from one peripheral to another without processing. For example, if you are trying to take data from a NIC and transfer it to a GPU, but not process the data in any way, CPU bypass works very well. It prevents the CPU and system memory from becoming a bottleneck. Likewise, on a storage server, if you simply need to take data from an SSD and pass it to a NIC during a read, CPU bypass can really help boost system performance. One important note: if you are well under the limits of the CPU, the benefits of bypass are small. So, think carefully about your system design and whether or not the CPU is a bottleneck. In some cases, people will use system memory as a cache and in these cases, bypassing CPU isn’t possible.

Q. How important is it to use All-Flash storage compared to HDD or hybrid?

A. It depends on your workloads. For any single model, you may be able to make do with HDDs. However, another consideration for many AI/ML systems is that their use can expand quite suddenly. Once there is some amount of success, you may find that more people want access to the data and the system experiences more load. So beware the success of these early projects, as you may find that the need to create multiple models from the same data could overload your system.

Q. Will storage for AI/ML necessarily be different from standard enterprise storage today?

A. Not necessarily. It may be possible for enterprise solutions today to meet your requirements. However, a key consideration is that if your current solution is barely able to handle its current requirements, then adding an AI/ML training workload may push it over the edge. In addition, even if your current solution is adequate, the sizes of many ML/DL models are growing exponentially every year. So, what you provision today may not be adequate in a year, or even in several months. Understanding the direction of the work your data scientists are pursuing is important for capacity and performance planning.
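
As a rough illustration of that growth concern, the sketch below assumes a data footprint that doubles every year (a simplifying assumption, not a forecast):

```python
capacity_tb = 100   # hypothetical provisioned capacity
footprint_tb = 40   # hypothetical footprint today

for month in range(0, 25, 6):
    size = footprint_tb * 2 ** (month / 12)  # doubling yearly
    print(f"month {month:2d}: {size:6.1f} TB ({size / capacity_tb:.0%} of capacity)")
# By month 24 the footprint (160 TB) has outgrown the 100 TB provisioned.
```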

Automating Discovery for NVMe IP-based SANs

Erik Smith

Oct 6, 2021

NVMe® IP-based SANs (including transports such as TCP, RoCE, and iWARP) have the potential to provide significant benefits in application environments ranging from the Edge to the Data Center. However, before we can fully unlock the potential of the NVMe IP-based SAN, we first need to address the manual and error-prone process that is currently used to establish connectivity between NVMe Hosts and NVM subsystems. This process requires administrators to explicitly configure each Host to access the appropriate NVM subsystems in their environment. In addition, any time an NVM Subsystem interface is added or removed, a Host administrator may need to explicitly update the configuration of impacted hosts to reflect this change.

Due to the decentralized nature of this configuration process, using it to manage connectivity for more than a few Host and NVM subsystem interfaces is impractical and adds complexity when deploying an NVMe IP-based SAN in environments that require a high degree of automation.
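
To see why this becomes impractical, consider a back-of-the-envelope sketch (the host and interface counts below are hypothetical):

```python
# Each host must be explicitly configured with every NVM subsystem
# interface it should access, so the entries grow multiplicatively.
hosts = 100
subsystem_interfaces = 8

entries = hosts * subsystem_interfaces
print(f"Connectivity entries maintained by hand: {entries}")  # 800

# Adding or removing a single subsystem interface may require touching
# the configuration of every impacted host.
print(f"Hosts potentially needing updates per change: {hosts}")
```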

For these and other reasons, several companies have been collaborating on innovations that simplify and automate the discovery process used with NVMe IP-based SANs. This will be the topic of our live webcast on November 4, 2021 “NVMe-oF: Discovery Automation for IP-based SANs.”

During this session we will explain:

  • The NVMe IP-based SAN discovery problem
  • The types of network topologies that can support the automated discovery of NVMe-oF Discovery controllers
  • Direct Discovery versus Centralized Discovery
  • An overview of the discovery protocol

We hope you will join us. The experts working to address this limitation with NVMe IP-based SANs will be on hand to answer your questions directly on November 4th. Register today.

Q&A (Part 2) from “Storage Trends for 2021 and Beyond” Webcast

Andee Wilcott

Oct 4, 2021

Questions from “Storage Trends for 2021 and Beyond” Webcast Answered

This is part two of the Q&A portion of the roundtable talk between Rick Kutcipal, board director, SCSI Trade Association (STA); Jeff Janukowicz, research vice president at IDC; and Chris Preimesberger, former editor-in-chief of eWeek, where they discussed prominent data storage technologies shaping the market. If you missed this webcast, titled “Storage Trends for 2021 and Beyond,” it’s available on demand here. Part One of the Q&A can be found at https://www.scsita.org/library/qa-part-1-from-storage-trends-for-2021-and-beyond-webcast/.

Q1. What are your views around NVMe over Fabrics?

A1. NVMe technology enables the benefits of flash-based storage to be realized at a much larger scale, beyond the confines of PCIe backplane-based systems. With NVMe-oF technology, it’s possible to attach many SSDs in a network, generally far more than the number that can be accommodated via PCIe backplane-based systems. High-performance, low-latency flash-based storage resources can be disaggregated from the servers and pooled into a network-attached, shared resource. With this pooling, the ability to provision just the right amount of storage for each workload on each server within the data center is achievable. This closely resembles the SAN technology that SAS/SCSI has been implementing for decades. SAS inherently scales from direct-attach topologies to large-scale storage systems with hundreds (if not thousands) of drives.

Q2. Where can we get the 24G SAS specs?

A2. The T10 Technical Committee of the InterNational Committee for Information Technology Standards (INCITS) develops the technical standards concerning the SAS specifications. INCITS is accredited by, and operates under rules that are approved by, the American National Standards Institute (ANSI). These rules are designed to ensure that voluntary standards are developed by the consensus of industry groups. Please visit the T10 Working Drafts site for the most comprehensive list of documents for the SAS technical specifications. Anyone can access the drafts until they are published by ANSI, after which they are available for purchase from the ANSI eStandards Store at https://webstore.ansi.org/SDO/INCITS.

Q3. Do you see any effort going into optical SAS? Will it get any coverage at the next plugfest?

A3. Optical 24G SAS cable samples are available today and have been tested during the 24G SAS plugfest. We are seeing increased interest in active 24G SAS cables, both optical and copper, and expect the trend to continue.

Q4. Do you have any report or study about Intel DC Persistent Memory usage for the next 5 years?

A4. Per the Emerging Non-Volatile Memory: Market and Technology Report 2020 (Yole Developpement, www.yole.fr):
  • The stand-alone emerging NVM market is dynamic and is expected to grow with a CAGR (2019-2025) of ~42%, reaching more than $4B by 2025. 3D XPoint-based products for the datacenter space will play a key role in sustaining this growth.
  • The stand-alone STT-MRAM market will be driven by adoption in low-latency storage (e.g., SSD caching), while RRAM could experience a resurgence thanks to the introduction of new low-latency RRAM-based drives by Japanese players.
Q5. Do you see NVMe/TCP becoming the dominant NVMe-oF protocol?

A5. Yes, NVMe-oF using TCP will likely become the preferred highly scalable NVMe protocol as it matures. Today, while NVMe-oF using TCP is not yet fully mature, recent demonstrations have shown throughput comparable to RDMA-based protocols, but there is a need for more efficient processing implementations.

Q6. Why has there been such a delay in getting NVMe hardware RAID controllers out into the mainstream market? I finally see some tri-mode (SAS/SATA/NVMe) controllers becoming available. Do you see these being widely adopted in the server market?

A6. Implementation of hardware RAID is difficult and requires a very intricate interaction between the HW RAID engine and the storage protocol. Today’s RAID engines have been developed and hardened over many generations of products, and introducing a new storage protocol will take time. Innovations to align with the requirements of NVMe hardware RAID will begin to emerge in the near future.

Q7. What do you see as the crossover timeline for NVMe replacing SATA SSDs?

A7. NVMe has already replaced SATA in PC and mobile compute systems. In the enterprise, value SAS is replacing a lot of SATA SSDs due to its near price parity with SATA.

Q8.1 Can you please share with us the roadmap of 24G SAS disk drives?

A8.1 The next generation of 24G SAS drives is available now and should become mainstream in 2022. It is expected that the next generation of SAS technology will include improvements made to the drives and to 24G SAS infrastructure, rather than turning the speed crank to 48G SAS.

Q8.2 Can you please share with us the current status of 24G SAS RAID controllers and JBOD expanders?

A8.2 With a second 24G SAS plugfest to be concluded in Q4 2021, the SAS ecosystem will reach a major milestone for production readiness. The specification is done, and most major system integrators are investing in it today. 24G SAS RAID/HBA controllers and adapter cards, SSDs, and analyzer products have already been announced, with PCIe 4.0 based servers deploying 24G SAS solutions shipping now. Further product introductions are expected to continue into 2022. Going forward, 24G SAS storage systems are expected within a few quarters after server launches.

Q8.3 In terms of the SAS spec, can you please list the major changes from 12Gb/s SAS to 24G SAS?

A8.3 In addition to doubling the effective bandwidth of 12Gb/s SAS, 24G SAS improvements over 12Gb/s SAS include:
  • 20-bit Forward Error Correction (FEC)
  • 128b/130b encoding
  • Active PHY Transmitter Adjustment (APTA)
  • Fairness and persistent connection enhancements
  • Storage intelligence to optimize SSD performance
Q9. The IDC slide does not show many SATA SSDs at all; SATA SSD share was quite prevalent in 2015-2020. Agree?

A9. The IDC slide shows enterprise storage capacity shipped, and while SAS and SATA represent a large portion of the storage shipped in 2015-2020, the vast majority of that capacity was in SAS and SATA HDDs, not SSDs.
