
A Q&A on Data Literacy

Jim Fister

Oct 5, 2020

The SNIA Cloud Storage Technologies Initiative (CSTI) recently hosted a conversation with Glyn Bowden from HPE, which I moderated, on "Using Data Literacy to Drive Insight." In a wide-ranging conversation of just over 45 minutes, we had a great discussion on a variety of topics related to ensuring the accuracy of data in order to draw the right conclusions, using current examples of data from the COVID-19 pandemic as well as law enforcement. Some questions and comments arose during the dialog, and we're collecting them in this blog.

Q. So who really needs data literacy skills?

A. Really, everyone does. We all make decisions in our daily lives, and it helps to understand the provenance of the information being presented. It's also important to find ways to get to the source material for the data when necessary, in order to make the best decisions. Everyone can benefit from knowing more about data. We all need to interpret the information offered to us by people, press, journals, educators, colleagues and friends.

Q. What's an example of "everyone" who needs data literacy?

A. I offered an example of my work as a board member in my local police and fire district, where I took on the task of statistical analysis of federal, state, and local COVID-19 data in order to estimate cases in the district that would affect the policies and procedures of the service district personnel. Glyn also offered simple examples of the differences between raw counts and percentages, and how they should be compared and contrasted. We cited some of the regional variations in COVID data, given the methodologies of the people reporting it. There are many other examples of literacy shared in the material, including some wonderful data around emergency service call personnel, weather, pubs, paydays, and lunar cycles. Why haven't you started watching it yet? Remember, it's on-demand, along with the presentation slides.

Q. What's the impact of bias in the "data chain"?

A. Bias can come from anywhere. Even the more "pure" providers of source data (in this case, doctors or hospital data scientists) can "pollute" the data. To qualify a report, you need to determine how much trust you have in the provider of the data. Glyn cited several examples of how the filter of the interpreter can introduce bias that must be understood by a viewer of the data. "Reality is an amplifier of bias" was the unsurprising conclusion. Glyn made an interesting comment on bias: when you see a summary, the first questions you should ask are what's been left out and why it was left out. What's left out is usually what creates the bias. It's also useful to look for any data that supports a counter-opinion, which might lead you to additional source material.

Q. On the concept of data modeling: at some point, you create a predictive model. First, how useful is it to review that model? And what does an incorrect model mean?

A. You MUST review a model; you can't assume that it will always be true, since you're acting on the data you have, and more will always come in to affect the model. You need to review it, and you should pick a regular cadence (a simple sketch of such a review appears below). If you see something wrong in the model, it could mean that you have incomplete data or have injected bias. Glyn offered a great example involving empty or full trash containers.

Q. So, the validity of the data model itself is actually data that you need to adjust your assumptions?

A. Absolutely. More data of any kind should affect the development of the next model. Everything needs to be challenged.
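To make the model-review answer above concrete, here is a minimal sketch in Python. It is purely illustrative: the error metric, the numbers and the 15% tolerance are our own choices, not anything from the webcast. The idea is simply that, on a regular cadence, you score the last model's predictions against newly observed data and flag the model for rebuilding when the error drifts past an agreed bound.

```python
def mean_abs_pct_error(predicted, actual):
    """Mean absolute percentage error between two equal-length series."""
    return sum(abs(p - a) / a for p, a in zip(predicted, actual)) / len(actual)

def review_model(predicted, actual, tolerance=0.15):
    """Return True if the model still holds; False means rebuild it."""
    error = mean_abs_pct_error(predicted, actual)
    print(f"review: error={error:.1%}, tolerance={tolerance:.0%}")
    return error <= tolerance

forecast = [120, 135, 150, 160]   # last month's model output (hypothetical)
observed = [118, 150, 200, 250]   # what actually happened: new data arrived

if not review_model(forecast, observed):
    print("Model no longer fits; check for incomplete data or injected bias.")
```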
Q. Would raw data therefore be the best data?

A. Raw data could have gaps that haven't been filled yet, or it might have sensor error of some type. There's a need to clean data, though be aware that cleaning raw data has the potential to inject bias. It takes judgment and model creation to validate your methods for cleaning data.

Q. Would it be worthwhile to run the models on both cleaned and raw data to see if the model holds up in a similar way?

A. Yes, and this is the way that many artificial intelligence systems are trained.

Q. Another question concerns data flow compared to the data itself. Is the flow of the data something that can be insightful?

A. Yes. The flow of data, and the iteration of the data through its lifecycle, can affect the accuracy. You won't really know how it's skewed until you look at the model, but make a determination and test it in order to see.

Q. How does this affect data and data storage?

A. As more data is collected and analyzed, we'll start to see different patterns emerge in our use of storage. So, analysis of your storage needs is another data model for you to consider!

Please feel free to view and comment, and we'd be happy to hear about future webcasts that would interest you.


An FAQ on Data Reduction Fundamentals

John Kim

Oct 5, 2020

There's a fair amount of confusion when it comes to data reduction terminology and techniques. That's why the SNIA Networking Storage Forum (NSF) hosted a live webcast, "Everything You Wanted to Know About Storage But Were Too Proud to Ask: Data Reduction." It was a 101-level lesson on the fundamentals of data reduction, which can be performed in different places and at different stages of the data lifecycle. The goal was to clear up confusion around different data reduction and data compression techniques and to set the stage for deeper-dive webcasts on this topic (see the end of this blog for info on those). As promised during the webcast, here are answers to the questions we didn't have time to address during the live event.

Q. Does block-level compression have any direct advantage over file-level compression?

A. One significant advantage is not requiring the entire thing (the file or database or whatever we're storing) to be compressed and decompressed as a unit. That would almost certainly increase read latency and, for large files, require quite a bit of caching. In the case of blocks, a single block can be the compression unit, even if it's part of a file, database or other larger data structure. Compressing a block is much faster and computationally less intensive, which is reflected in reduced latency overhead and cache impact. (A short code sketch of this idea appears a few answers below.)

Q. You made it sound like thin provisioning has no overhead, but on-demand allocation is an overhead and can be quite bad at the worst time. Do you agree?

A. Finding free space when the system is at capacity may be an issue, and this may indeed cause significant slowdowns. This is an undesirable situation, and the advice is never to run so close to the capacity wire that thin provisioning impacts performance or jeopardizes successfully writing the data. In a system with an adequate amount of free space, caching can make the normally small overhead of thin provisioning very small to unmeasurable.

Q. Will migration to SSD zoning vs. HDD-based blocks/pages impact data compression?

A. It shouldn't, since compression is done at a level where zoning isn't an issue. Compression is only applicable to blocks or files.

Q. Does compressing blocks on computational storage devices have the disadvantage of not reducing PCIe bandwidth, since raw data has to be transferred over to the storage devices?

A. Yes. But the same is true of any storage device, so computational storage is no worse in respect of the transfer of the data, and it provides much more apparent storage on the device once the data gets there. A computational storage device requires no application changes to do this.

Q. How do we measure performance in out-of-line data reduction?

A. Data reduction techniques like compression and deduplication can be done in-line (that is, while writing the data) or out-of-line (at a later point in time). Out-of-line shifts the compute required from now (where big horsepower is required if there's to be no impact on storage performance) to later, where smaller processors can take their time. Out-of-line data reduction requires more space to store the data, as it's unreduced when written. These tradeoffs also have impacts on performance (both back-end latency and bandwidth), and all of this affects the total cost of the system. It's not so much that we need to measure the performance of in-line vs. out-of-line (something we know how to do) and declare one a winner; it's whether the system provides the needed performance at the right cost. That's a purchasing decision, not a technology one.
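Before moving on to the deduplication questions, here is a minimal sketch of the block-level compression advantage described in the first answer. It is a toy, not any product's implementation, and the 4 KiB block size is just a common choice: the point is that a random read only has to decompress one block rather than the whole file.

```python
import zlib

BLOCK_SIZE = 4096  # a common block size; purely illustrative

def compress_blocks(data: bytes) -> list[bytes]:
    """Compress each fixed-size block independently."""
    return [zlib.compress(data[i:i + BLOCK_SIZE])
            for i in range(0, len(data), BLOCK_SIZE)]

def read_block(blocks: list[bytes], index: int) -> bytes:
    """Random read: only the requested block is decompressed."""
    return zlib.decompress(blocks[index])

data = b"some highly repetitive file contents " * 2000
blocks = compress_blocks(data)

# File-level compression forces decompressing everything for any read:
whole_file = zlib.compress(data)
assert zlib.decompress(whole_file) == data

# Block-level compression touches only ~4 KiB for the same random read:
assert read_block(blocks, 3) == data[3 * BLOCK_SIZE:4 * BLOCK_SIZE]
```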
Q. How do customers (or vendors) decide how wide their deduplication net should be, i.e., one disk, per file, across one file system, one storage system, or multiple storage systems?

A. By testing and balancing the savings vs. the cost. One thing is true: the balance right now is very definitely in favor of deduplicating at every level where possible. Vendors can demonstrate huge space-savings advantages by doing so. Consumers, as indicated by my answer to the previous question, need to look at the whole system and its cost vs. performance, and buy on that basis.

Q. Is compression like doing deduplication on a very small and very local scale?

A. You could think of it as bit-level deduplication, and then realize that you can stretch an analogy to breaking point...

Q. Are some blocks or files so small that it's not worth doing deduplication or cloning because the extra metadata will be larger than the block/file space savings?

A. Yes. They're often stored as-is, but they do need metadata to say that they're raw and not reduced.

Q. Do cloning and snapshots operate only at the block level, or can they operate at the file or object level too?

A. Cloning and snapshots can operate at the file or object level, as long as there is an efficient way of extracting and storing the differences. Sometimes it's cheaper and simpler just to copy the whole thing, especially for small files or objects.

Q. Why does Virtual Data Optimizer (VDO) do deduplication before compression if the other way is preferable? Why is it better to compress and then deduplicate?

A. That's a decision that the designers of VDO felt gave them the best storage efficiency and reasonable compute overheads. (It's also not the only system that uses this order.) But the dedupe scope of VDO is relatively small. Compression then deduplication allows in-line compression with out-of-line and much broader deduplication across very large sets of data, and there are many systems that use this order for that reason.

Q. There's also so much stuff because we (as an industry) have enabled storing so much stuff cheaply and affordably. Today's business and storage market would look and act differently if costs were different. Data reduction's interaction with encryption (e.g., proper ordering) could be useful to mention. Or a topic for another presentation!

A. We'll consider it!

Remember I said we were taking a deeper dive on the topic of data reduction? We have two more webcasts in this series: one on compression and the other on data deduplication. You can access them here:


Optimizing NVMe over Fabrics Performance Q&A

Tom Friend

Oct 2, 2020


Almost 800 people have already watched our webcast “Optimizing NVMe over Fabrics Performance with Different Ethernet Transports: Host Factors” where SNIA experts covered the factors impacting different Ethernet transport performance for NVMe over Fabrics (NVMe-oF) and provided data comparisons of NVMe over Fabrics tests with iWARP, RoCEv2 and TCP. If you missed the live event, watch it on-demand at your convenience.

The session generated a lot of questions, all answered here in this blog. In fact, many of the questions have prompted us to continue this discussion with future webcasts on NVMe-oF performance. Please follow us on Twitter @SNIANSF for upcoming dates.

Q. What factors will affect the performance of NVMe over RoCEv2 and TCP when the network between host and target is longer than typical Data Center environment? i.e., RTT > 100ms

A. For a large deployment with long distance, congestion management and flow control will be the most critical considerations to make sure performance is guaranteed. In a very large deployment, network topology, bandwidth subscription to storage target, and connection ratio are all important factors that will impact the performance of NVMe-oF.

Q. Were the RoCEv2 tests run on 'lossless' Ethernet and the TCP tests run on 'lossy' Ethernet?

A. Both iWARP and RoCEv2 tests were run in a back to back configuration without a switch in the middle, but with Link Flow Control turned on.

Q. Just to confirm, this is with pure ROCEv2? No TCP, right? ROCEv2 end 2 end (initiator 2 target)?

A. Yes, for RoCEv2 test, that was RoCEv2 Initiator to RoCEv2 target.

Q. How are the drives being preconditioned? Is it based on I/O size or MTU size? 

A. Storage is preconditioned by the I/O size and type of the selected workload. MTU size is not relevant. The selected workload is applied until performance changes are time-invariant, i.e. until performance stabilizes within a range known as steady state. Generally, the workload is tracked by specific I/O size and type to remain within a data excursion of 20% and a slope of 10%.
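For the curious, below is a rough sketch of that steady-state check, loosely modeled on the SNIA SSS Performance Test Specification; the exact windowing and rounding rules live in the spec, and the five-round window here is illustrative.

```python
def is_steady_state(window, excursion_limit=0.20, slope_limit=0.10):
    """True when the measurement window meets the 20%/10% criteria above."""
    n = len(window)
    avg = sum(window) / n
    excursion = (max(window) - min(window)) / avg
    # Least-squares slope of performance vs. round number.
    x_avg = (n - 1) / 2
    slope = (sum((x - x_avg) * (y - avg) for x, y in enumerate(window))
             / sum((x - x_avg) ** 2 for x in range(n)))
    drift = abs(slope * (n - 1)) / avg   # total drift across the window
    return excursion <= excursion_limit and drift <= slope_limit

rounds = [410_000, 395_000, 401_000, 398_000, 404_000]  # IOPS per round
print("steady state reached:", is_steady_state(rounds))
```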

Q. Are the 6 SSDs off a single Namespace, or multiple? If so, how many Namespaces used?

A. Single namespace.

Q. What I/O generation tool was used for the test?

A. The Calypso CTS IO Stimulus generator, which is based on libaio. CTS has the same engine as fio and applies IOs at the block IO level. Note that tools like vdbench (which is Java-based) and Iometer work at the file system level, higher in the software stack.

Q. Given that NVMe SSD performance is high with low latency, is it not that the performance bottleneck is shifted to the storage controller?

A. Test I/Os are applied to the logical storage seen by the host on the target server, in an attempt to normalize the host and target in order to assess NIC-wire-NIC performance. The storage controller is beneath this layer and not applicable to this test. If we test the storage directly on the target (not over the wire), then we can see the impact of the controller and controller-related issues (such as garbage collection, over-provisioning, table structures, etc.).

Q. What are the specific characteristics of RoCEv2 that restrict it to 'rack' scale deployments?  In other words, what is restricting it from larger scale deployments?

A. RoCEv2 can, and does, scale beyond the rack if you have one of three things:

  1. A lossless network with DCB (priority flow control)
  2. Congestion management with solutions like ECN
  3. Newer RoCEv2-capable adapters that support out of order packet receive and selective re-transmission

Your mileage will vary based upon features of different network vendors.

Q. Is there an option to use some caching mechanism on host side?

A. The host side has a RAM cache per the platform setup, but it is held constant across these tests.

Q. Was there caching in the host?

A. The test used host memory for NVMe over Fabrics.

Q. Were all these topics from the description covered?  In particular, #2?
We will cover the variables:

  1. How many CPU cores are needed (I’m willing to give)?
  2. Optane SSD or 3D NAND SSD?
  3. How deep should the Q-Depth be?
  4. Why do I need to care about MTU?

A. Cores: see the TC/QD sweep to find the optimal OIO; the core usage required can be inferred from this. Note the incongruity of TC/QD to OIO of 8, 16, 32, 48 in this case.

  1. The test used a dual-socket server on the target with an Intel® Xeon® Platinum 8280L processor with 28 cores. The target server used only one processor so that all the workloads were on a single NUMA node. The 1-4% CPU utilization is the average across 28 cores.
  2. SSD-1 is Optane SSD, SSD-2 is 3D NAND.
  3. Normally QD is set to 32.
  4. You do not need to care about MTU; at least in our test, we saw minimal performance differences.

Q. The result of 1~4% of CPU utilization on target is based on single SSD? Do you expect to see much higher CPU utilization if the amount of SSD increases?

A. The CPU % is for the target server with the six-SSD LUN.

Q. Is there any difference between the different transports and the sensitivity of lost packets?

A. Theoretically, iWARP and TCP are more tolerant of packet loss. iWARP is based on TCP/IP; TCP provides flow control and congestion management, so it can still perform in a congested environment. In the event of packet loss, iWARP supports selective retransmission and out-of-order packet receive, and those technologies can further improve performance in a lossy network. A standard RoCEv2 implementation, by contrast, does not tolerate packet loss; it requires a lossless network and experiences performance degradation when packet loss happens.

Q. 1. When you mean offload TCP, is this both at Initiator and target side or just host initiator side?
2. Do you see any improvement with ADQ on TCP?

A. The RDMA iWARP in the test has a complete TCP offload engine on the network adapter on both the initiator and target sides. Application Device Queues (ADQ) can significantly improve throughput, latency and, most importantly, latency jitter, with dedicated CPU cores allocated for NVMe-oF solutions.

Q. Since the CPU utilization is extremely low on the host, any comments about the CPU role in NVMe-oF and the impact of offloading?

A. NVMe-oF was designed to reduce the CPU load on the target, as shown in the test. On the initiator side, CPU load will be a little higher. RDMA, as an offloaded technology, requires fairly minimal CPU utilization. NVMe over TCP still uses the TCP stack in the kernel to do all the work, so the CPU still plays an important role. Also, the test was done with a high-end Intel® Xeon® processor with very powerful processing capability; if a processor with less processing power were used, CPU utilization would be higher.

Q. 1. What should be the ideal encapsulated data (inline data) size for best performance in a real-world scenario? 2. How could one optimize buffer copies at the block level in NVMe-oF?

A. 1. There is no simple answer to this question. The impact of encapsulated data size on performance in a real-world scenario is more complicated, as the switch plays a critical role in the whole network. Whether there is a shallow-buffer or deep-buffer switch, switch settings like policy, congestion management, etc. all impact the overall performance. 2. There are multiple explorations underway to improve the performance of NVMe-oF by reducing or optimizing buffer copies. One possible option is to use the controller memory buffer introduced in NVMe Specification 1.2.

Q. Is it possible to combine any of the NVMe-oF technologies with SPDK (user space processing)?

A. SPDK currently supports all these Ethernet-based transports: iWARP, RoCEv2 and TCP.

Q. You indicated that TCP is non-offloaded, but doesn't it still use the 'pseudo-standard' offloads like Checksum, LSO, RSS, etc?  It just doesn't have the entire TCP stack offloaded?

A. Yes, stateless offloads are supported and used.

Q. What is the real idea in using 4 different SSDs? Why didn't you use 6 or 8 or 10? What is the message you are trying to relay? I understand that SSD1 is higher/better performing than SSD2.

A. We used a six-SSD LUN for both SSD-1 and SSD-2. We compared higher-performance, lower-capacity Optane to lower-performance, higher-capacity 3D NAND NVMe. Note that the 3D NAND is 10X the capacity of the Optane.

Q. It looks like one of the key takeaways is that SSD specs matter. Can you explain (without naming brands) the main differences between SSD-1 and SSD-2?

A. Manufacturer specs are only a starting point, and actual performance depends on the workload. Large differences are seen for small-block random write (RND W) workloads and large-block sequential read (SEQ R) workloads.

Q. What is the impact to the host CPU and memory during the tests? Wondering what minimum CPU and memory are necessary to achieve peak NVMe-oF performance, which leads to describe how much application workload one might be able to achieve.

A. The test did not limit CPU cores or memory to find the minimal configuration that achieves peak NVMe-oF performance. This might be an interesting topic to cover in a future presentation. (We measured target server CPU usage, not host/initiator CPU usage.)

Q. Did you let the tests run for 2 hours and then take results? (basically, warm up the cache/SSD characterization)?

A. We precondition with the TC/QD sweep test, then run the remaining three tests back to back to take advantage of the preconditioning done in the first test.

Q. How do you check outstanding IOs?

A. We use OIO = TC x QD in the test settings and populate each thread with QD jobs. We do not look at in-flight OIO, but wait for all OIOs to complete and measure response times.
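As a tiny illustration of that bookkeeping (the TC/QD pairs below are made up, not the actual test matrix from the webcast):

```python
def sweep(tc_qd_pairs):
    """Print the outstanding IO (OIO) implied by each TC/QD combination."""
    for tc, qd in tc_qd_pairs:
        oio = tc * qd  # each of TC threads is populated with QD jobs
        print(f"TC={tc:>2} QD={qd:>2} -> OIO={oio}")

# One way to hit OIO targets of 8, 16, 32 and 48:
sweep([(1, 8), (2, 8), (4, 8), (6, 8)])
```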

Q. Where can we get the performance test specifications as defined by SNIA?

A. You can find the test specification on the SNIA website here.

Q. Have these tests been run using FC-NVMe? If so, how did they fare?

A. We have not yet run these tests using NVMe over Fibre Channel.

Q. What tests did you use? FIO, VDBench, IOZone, or just DD or IOMeter? What was the CPU peak utilization? and what CPUs did you use?

A. The Calypso CTS IO generator, which is similar to fio, as both are based on libaio and test at the block level. Vdbench, IOzone and Iometer work at the file system level. DD is direct and lacks complex scripting. Fio allows complex scripting, but not multiple variables per loop; i.e., it requires iterative tests and post-test compilation, vs. CTS, which has multi-variable, multi-loop concurrency.

Q. What test suites did you use for testing?

A. Calypso CTS tests

Q. I heard that iWARP is dead?

A. No, iWARP is not dead. There are multiple Ethernet network adapter vendors supporting iWARP now. The adapter used in the test supports iWARP, RoCEv2 and TCP at the same time.

Q. Can you post some recommendation on the switch setup and congestion?

A. The test discussed in this presentation used a back-to-back configuration without a switch. We will have a presentation in the near future that takes switch settings into account and will share more information at that time. Don't forget to follow us on Twitter @SNIANSF for dates of upcoming webcasts.


Keeping Up with 5G, IoT and Edge Computing

Michael Hoard

Oct 1, 2020

The broad adoption of 5G, the Internet of Things (IoT) and edge computing will reshape the nature and role of enterprise and cloud storage over the next several years. What building blocks, capabilities and integration methods are needed to make this happen? That will be the topic of discussion at our live SNIA Cloud Storage Technologies webcast on October 21, 2020, "Storage Implications at the Velocity of 5G Streaming." Join my SNIA expert colleagues, Steve Adams and Chip Maurer, for a discussion on common questions surrounding this topic, including:
  • With 5G, IoT and edge computing – how much data are we talking about?
  • What will be the first applications leading to collaborative data-intelligence streaming?
  • How can low latency microservices and AI quickly extract insights from large amounts of data?
  • What are the emerging requirements for scalable stream storage – from peta to zetta?
  • How do yesterday’s object-based batch analytic processing (Hadoop) and today’s streaming messaging capabilities (Apache Kafka and RabbitMQ) work together?
  • What are the best approaches for getting data from the Edge to the Cloud?
I hope you will register today and join us on October 21st. It’s live so please bring your questions!


An FAQ on the “Fine Print” of Cyber Insurance

Paul Talbut

Sep 30, 2020

Last month, the SNIA Cloud Storage Technologies Initiative convened experts Eric Hibbard and Casey Boggs for a webcast on cyber insurance, a growing area to further mitigate risks from cyber attacks. However, as our attendees learned, cyber insurance is not as simple as buying a pre-packaged policy. If you missed the live event, "Does Your Cyber Insurance Strategy Need a Tune-Up?" you can watch it on-demand. Determining where and how cyber insurance fits in a risk management program generates a lot of questions. Our experts have provided answers to them all here:

Q. Do "mega" companies buy cyber insurance, or do they self-insure?

A. Many Fortune 500 companies do carry cyber insurance. The scope of coverage can vary significantly. Concerns over ransomware are often a driver. Publicly traded companies have a need to meet due-care obligations, and cyber insurance is a way of demonstrating this.

Q. Insurance companies don't like to pay out. I suspect making a claim is quite contentious?

A. It depends on the nature and the amount of the claim. Most policies have exemptions, triggers, caps, etc. that have to be navigated. Avoiding payouts is bad business for insurance companies, which are operating in a very competitive space.

Q. How much does cyber insurance cost? Either an example or hypothetical.

A. Due to all the factors involved (e.g., size of organization, market sector, location, type of organization, policy coverage/exemptions, and others) it is not possible to make general estimates. That said, many insurers have online quote capabilities that can be used to explore basic options and pricing.

Q. Do insurance companies do audits of actual practices, e.g. whether there are actual (vs. claimed) controls on insider access to confidential data? Either before issuing a policy or after an incident. If so, how are audits done?

A. It depends on the nature of the coverage. The organization may need to supply certain documents (security policies, incident response plan, etc.) and make assertions about its operations. Policy discounts can be dependent on audits (insurer or third-party). Also, a claim may trigger an investigation/audit.

Q. Is it possible that business executives see an insurance policy as simply a safeguard against a cyber attack?

A. Yes, it is possible, and the fear of ransomware could be a key motivation. However, such a simplistic view is not likely to be productive. Cyber insurance needs to be an element of your overall risk program and carefully matched to the organization's needs. You don't want to learn that you purchased the wrong kind of insurance after an incident. That is like being victimized twice.

Q. To what degree do businesses need to do risk assessment? Is it not just an IT/data security problem?

A. Assessing your risk and determining your risk appetite are critical prerequisites to purchasing cyber insurance. Without these insights there is no way for the organization to know what kind of coverage it should get. Such an activity should be driven by the CFO or someone with responsibility for the operations of the organization. IT (via the CIO) and data security (via the CISO) should play a supporting role, but they should not be the drivers.


Composable or Computational – Your Questions Answered!

Eli Tiomkin

Sep 24, 2020

Our recent webcast on Composable Infrastructure and Computational Storage raised some interesting questions. No need to compose your own answers; my co-presenter Philip Kufeldt and I answer them here! You can find the entire webcast video along with the slide PDF in the SNIA Educational Library. We also invite you and your colleagues to take 10 and watch three short videos on computational storage topics.

Q. I'm a little confused about moving data across, for example, NVMe-oF, as it consumes DDR bandwidth. Can you elaborate?

A. Any data moving into or out of a server consumes DDR bandwidth by virtue of the DMAs done. Consider a simple NFS file server, where I as a client write a 1 GiB file. The data arriving from the client first appears at the server as a series of TCP packets. These packets arrive at the NIC and are then DMA-ed across the PCIe bus into main memory, where the TCP/IP stack deciphers them and ultimately delivers them to the waiting NFS server software. If you have a smart NFS implementation, that copy from the PCIe NIC to main memory is the only copy in the process, but you have still consumed 1 GiB of DDR bandwidth. Now the NFS server software translates the request into a series of block IO requests to an underlying storage device via a SATA/SAS/NVMe controller. This controller will again DMA the data, this time from memory to the device, consuming another 1 GiB of DDR bandwidth. Traditionally this has not really been noticed, because the devices consuming the data were slow enough to throttle how quickly DDR bandwidth could be consumed. Now we have SSDs capable of consuming several GiB/s of bandwidth per device, so you can easily design an unbalanced system where the DDR bus actually caps storage throughput.

Q. Some vendors are now running virtual machines within their arrays. Would these systems be considered computational storage systems?

A. SNIA and other working groups are defining computational storage at the storage device level, not the system level, and a system needs at least some sort of computational storage processor (CSP) to qualify. While these systems have intelligence, it is not at the storage device level; it is applied above that level, before the user sees it (still at the CPU level).

Q. For composable infrastructure, you mentioned CXL as a more evolved PCIe fabric. When will it actually be released? How about using PCIe Gen4 as a fabric, as it's available today?

A. PCIe 4 does not provide robust memory semantics, specifically the cache coherency needed by some of these devices. This is the exact purpose of CXL: to extend PCIe to better support the load/store operations, including cache coherency, needed by memory and memory-like devices.

Q. Computational storage moves processing into storage. Isn't that the opposite of disaggregation in composable infrastructure?

A. It is and it isn't. As said in the presentation, the diagram of CI was quite simplistic. I doubt there will ever be just processors connected to a fabric. Just as processors have memory built into them (level 1-3 caches), you can envision CI processor elements having some amount of local RAM as part of the component, an external level 4 cache if you will. Imagine a small PCB with a processor and some small number of DIMMs; other memory resources might be across the fabric to complete the memory requirements of a composed system. Storage devices already have processor and memory components within them for processing IO requests. Augmenting these resources to handle portions of the governing app's processing allows cycles to migrate to the data: not the entire app, but some of the data-centric portions of it. This is exactly how TPU or GPU processing would work as well, migrating the computational portions of the app to the component.
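To illustrate that cycle-migration idea, here is a deliberately toy sketch; none of these classes or methods correspond to a real device API. The conventional path ships every record across the fabric and filters on the host; the computational path runs the filter, the data-centric portion of the app, where the data lives, so only matches cross the wire.

```python
from typing import Callable

class ToyComputationalDrive:
    """Stand-in for a drive with a computational storage processor (CSP)."""

    def __init__(self, records: list[bytes]):
        self._records = records

    def read_all(self) -> list[bytes]:
        # Conventional path: every record crosses the fabric to the host.
        return list(self._records)

    def scan(self, predicate: Callable[[bytes], bool]) -> list[bytes]:
        # Computational path: the predicate executes device-side.
        return [r for r in self._records if predicate(r)]

drive = ToyComputationalDrive([b"error: disk 3", b"ok", b"ok", b"error: fan 1"])

host_filtered = [r for r in drive.read_all() if r.startswith(b"error")]
device_filtered = drive.scan(lambda r: r.startswith(b"error"))
assert host_filtered == device_filtered  # same answer, far less data moved
```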


Security & Privacy Regulations: An Expert Q&A

J Metz

Sep 24, 2020

Last month the SNIA Networking Storage Forum continued its Storage Networking Security webcast series with a presentation on security and privacy regulations. We were fortunate to have security experts Thomas Rivera and Eric Hibbard explain the current state of regulations related to data protection and data privacy. If you missed it, it's available on-demand.

Q. Do you see the US working towards a national policy around privacy, or is it going to stay state-specified?

A. This probably will not happen anytime soon, due to political reasons. Having a national policy on privacy is not necessarily a good thing, depending on your state. Such a policy would likely have a preemption clause and could be used to diminish requirements from states like California and Massachusetts.

Q. Can you quickly summarize the IoT law? Does it force IoT manufacturers to continually support IoT devices (i.e., security patches) throughout their lifetime?

A. The California IoT law is vague, in that it states that devices are to be equipped with "reasonable" security feature(s) that are all of the following:
  • Appropriate to the nature and function of the device
  • Appropriate to the information it may collect, contain, or transmit
  • Designed to protect the device and any information contained therein from unauthorized access, destruction, use, modification, or disclosure
This is sufficiently vague that it may be left to lawyers to determine whether the requirements have been met. It is also important to remember that "IoT" is a nickname, because the law applies to all "connected devices" (i.e., any device or other physical object that is capable of connecting to the Internet, directly or indirectly, and that is assigned an Internet Protocol address or Bluetooth address). It also states that if a connected device is equipped with a means for authentication outside a LAN, then either a preprogrammed password that is unique to each device manufactured, or a security feature that requires a user to generate a new means of authentication before access is granted to the device for the first time, is required.

Q. You didn't mention Brexit. To date the plan is to follow GDPR, but it may change. Any thoughts?

A. British and European Union courts recognize a fundamental right to data privacy under Article 8 of the binding November 1950 European Convention on Human Rights (ECHR). In addition, Britain had to implement GDPR as a member nation. Post-Brexit, the UK will not have to continue implementing GDPR as the other member countries in the EU do. However, Britain will be subject to EU data transfer approval as a "third country," like the US. Speculation has been that Britain would attempt a "Privacy Shield" agreement modeled after the arrangement between the United States and the European Union. With the Court of Justice of the European Union's recent judgment declaring "invalid" the European Commission's Decision (EU) 2016/1250 of 12 July 2016 on the adequacy of the protection provided by the EU-U.S. Privacy Shield (i.e., the EU-U.S. Privacy Shield Framework is no longer a valid mechanism to comply with EU data protection requirements when transferring personal data from the European Union to the United States), such an approach is now unlikely. It is not clear what Britain will do at this point and, as with many elements of Brexit, Britain could find itself digitally isolated from the EU if data privacy is not handled as part of the separation agreement.

Q. In thinking of privacy, what are your thoughts on encryption being challenged, e.g., by the EARN IT Act or the LAED Act? It seems like that goes against a nationwide privacy movement, if there is one.

A. The US Government (and many others) have a love/hate relationship with encryption. They want everyone to use it to protect sensitive assets, unless you are a criminal; then they want you to do everything in the clear so they don't have to work too hard to catch and prosecute you... or simply persecute you. The back-door argument is amusing, because most governments don't have the ability to prevent something like this from being exploited by attackers (non-government types). If the US Government can't secure its own personnel records, which potentially exposes every civil servant along with his/her family and colleagues to attacks, how could it protect something as important as a back-door? If you want to learn more about encryption, watch the Encryption 101 webcast we did as part of this series.


Non-Cryptic Answers to Common Cryptography Questions

Alex McDonald

Sep 23, 2020

The SNIA Networking Storage Forum's Storage Networking Security webcast series continues to examine the many different aspects of storage security. At our most recent webcast, on applied cryptography, our experts dove into user authentication, data encryption, hashing, blockchain and more. If you missed the live event, you can watch it on-demand. Attendees of the live event had some very interesting questions on this topic, and here are answers to them all:

Q. Can hashes be used for storage deduplication? If so, do the hashes need to be 100% collision-proof to be used for deduplication?

A. Yes, hashes are often used for storage deduplication. It's preferred that they be collision-proof, but it's not required if the deduplication software does a bit-by-bit comparison of any files that produce the same hash, in order to verify whether they really are identical. If the hash is 100% collision-proof, then there is no need to run bit-by-bit comparisons of files that produce the same hash value. (A minimal sketch of this approach appears later in this post.)

Q. Do cloud or backup service vendors use blockchain proof of space to prove to customers how much storage space is available or has been reserved?

A. There are some vendors who are using proof of space to map or plot the device. Once the device is plotted, you can produce a report that summarizes the storage space available. Some vendors use it today. Since mining is the most popular application today, mining users use this information to report available space for mining pool applications. Can you use it for an enterprise cloud to monitor available disk space? Absolutely.

Q. If a vendor provides a guarantee of space to a customer using blockchain, does something prevent them from filling up the space before the customer uses that space?

A. Once the disk is plotted, there is no way for any other application to use it; any attempt will be flagged as an error. In fact, it's a really great way to ensure that no attacks are occurring on the disk itself. Each block of space is mapped and indexed.

Q. I lost track during the explanation about proofs in blockchain. What are those algorithms used for?

A. There are two concepts which are normally discussed and which create the confusion. One is that a blockchain can use different cryptographic hash algorithms, such as SHA-256 (one of the most popular), Whirlpool, RIPEMD (RACE Integrity Primitives Evaluation Message Digest), Dagger-Hashimoto and others. The Merkle tree is a blockchain construct which allows one to build a chain by using hashes and data blocks. Consensus protocols are protocols for decision making, such as Proof of Work, Proof of Space, Proof of Stake, etc. Each consensus protocol uses the distributed ledger to make a record for each block of data transferred. The use of cryptographic hashes allows us to create a trustless system by protecting the data being transferred from point A to point B, while the consensus protocol allows us to keep a record of the data blocks in distributed ledgers. This is a brief answer to the question; if you would like additional information, please contact olga@myactionspot.com and I will be happy to deliver a detailed session on this topic.

Q. How does encryption work in storage replication? Please advise whether this exists.

A. Yes, it exists. Encryption can be applied to data at rest, and that encrypted data can be replicated; and/or the replication process can encrypt the data temporarily while it's in transit.
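Here is the minimal sketch of hash-based deduplication promised in the first answer above. The chunk store and function names are hypothetical; the point is the order of operations: match on the SHA-256 digest first, then confirm byte-by-byte before declaring a duplicate.

```python
import hashlib

store: dict[str, bytes] = {}   # digest -> the one stored copy of a chunk

def dedup_write(chunk: bytes) -> str:
    """Store a chunk only if an identical one is not already present."""
    digest = hashlib.sha256(chunk).hexdigest()
    if digest in store:
        # Same hash: verify byte-by-byte before treating it as a duplicate.
        if store[digest] == chunk:
            return digest          # duplicate; nothing new is stored
        raise RuntimeError("hash collision detected")
    store[digest] = chunk
    return digest

refs = [dedup_write(c) for c in (b"block A", b"block B", b"block A")]
print(len(store), "unique chunks stored for", len(refs), "writes")  # 2 for 3
```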
Q. Regarding blockchain: assuming a new transaction (nobody has the information yet), is it possible that when sending the broadcast someone modifies part of the data (0.1%, for example) and this data continues to travel over the network without being considered corrupted?

A. The first block of data building the blockchain creates the authenticity. If the block and hash just created are originals, they will be accepted as originals, recorded in the distributed ledger and moved across the chain. But if you attempt to send a block that has already been authenticated, that block will not be authenticated again and will be discarded once it's on the chain.

Remember we said this was part of a series? We've already had a lot of great experts cover a wide range of storage security topics. You can access all of them at the SNIA Educational Library.
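Finally, for readers curious about the Merkle tree construct mentioned in the proofs answer, here is a bare-bones sketch (simplified; real blockchains add serialization rules, domain separation and more). Leaves are hashes of data blocks and parents are hashes of their concatenated children, so the single root hash commits to every block and any tampering changes it.

```python
import hashlib

def sha256(b: bytes) -> bytes:
    return hashlib.sha256(b).digest()

def merkle_root(blocks: list[bytes]) -> bytes:
    """Fold block hashes pairwise until a single root hash remains."""
    level = [sha256(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:               # duplicate the last node if odd
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1])
                 for i in range(0, len(level), 2)]
    return level[0]

blocks = [b"tx1", b"tx2", b"tx3", b"tx4"]
root = merkle_root(blocks)
tampered = merkle_root([b"tx1", b"tx2", b"tx3-modified", b"tx4"])
assert root != tampered   # any modified block changes the root
```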
