
Programming Frameworks Q&A

Alex McDonald

Nov 17, 2022

Last month, the SNIA Networking Storage Forum made sense of the “wild west” of programming frameworks, covering xPUs, GPUs and computational storage devices at our live webcast, “You’ve Been Framed! An Overview of xPU, GPU & Computational Storage Programming Frameworks.” It was an excellent overview of what’s happening in this space. There was a lot to digest, so our stellar panel of experts has taken the time to answer the questions from our live audience in this blog.

Q. Why is it important to have open-source programming frameworks?

A. Open-source frameworks enable community support and partnerships beyond what proprietary frameworks support. In many cases they allow ISVs and end users to write one integration that works with multiple vendors.

Q. Will different accelerators require different frameworks, or can one framework eventually cover them all?

A. Different frameworks support different accelerator attributes and specific applications. Trying to build a single framework that does the job of all the existing frameworks and covers all possible use cases would be extremely complex and time consuming, and in the end it would not produce the best results. Having separate frameworks that can co-exist is a more efficient and effective approach. That said, a well-defined hardware abstraction layer does complement different programming frameworks and can allow one framework to support different types of accelerators.

Q. Is there a benefit to standardization at the edge?

A. The edge is a universal term that has many different definitions, but in this context it can be described as a network of endpoints where data is generated, collected and processed. Standardization helps with developing a common foundation that can be referenced across application domains, and this can make it easier to deploy different types of accelerators at the edge.

Q. Does adding a new programming framework in computational storage help to alleviate current infrastructure bottlenecks?

A. The SNIA Computational Storage API and TP4091 programming framework enables a standard programming approach over proprietary methods that may be vendor limited. The computational storage value proposition significantly reduces resource constraints, while the programming framework supports improved resource access at the application layer. By making it easier to deploy computational storage, these frameworks may relieve some types of infrastructure bottlenecks.

Q. Do these programming frameworks typically operate at a low level or high level?

A. They operate at both. The goal of programming frameworks is to operate at the application resource management level, with high-level command calls that can initiate underlying hardware resources. They typically engage the underlying hardware resources using lower-level APIs or drivers.

Q. How does one determine which framework is best for a particular task?

A. Framework selection should be driven by which accelerator type is best suited to run the workload. Additionally, when multiple frameworks could apply, the decision on which to use would depend on the implementation details of the workload components. Multiple frameworks have been created and continue to evolve because of this fact, so there is not always a single answer to the question. The key idea motivating this webinar was to increase awareness of the frameworks available so that people can answer this question for themselves.

Q. Does using an open-source framework generally give you better or worse performance than using other programming options?
A. There is usually no significant performance difference between open-source and proprietary frameworks; the former, however, is relatively more adaptable and scalable thanks to the interested open-source community. A proprietary framework might offer better performance or access to a few more features, but it usually works only with accelerators from one vendor.

Q. I would like to hear more on accelerators to replace vSwitches. How are these different from NPUs?

A. Many of these accelerators include the ability to accelerate a virtual network switch (vSwitch) using purpose-built silicon as one of several tasks they can accelerate, and they are usually deployed inside a server to accelerate the networking instead of running the vSwitch on the server’s general-purpose CPU. A Network Processing Unit (NPU) is also an accelerator chip with purpose-built silicon, but it typically accelerates only networking tasks and is usually deployed inside a switch, router, load balancer or other networking appliance instead of inside a server.

Q. I would have liked to have seen a slide defining GPU and DPU for those new to the technology.

A. SNIA has been working hard to help educate on this topic. A good starting point is our “What is an xPU” definition. There are additional resources on that page, including the first webcast we did on this topic, “SmartNICs to xPUs: Why is the Use of Accelerators Accelerating.” We encourage you to check them out.

Q. How do computational storage devices (CSDs) deal with “data visibility” issues when the drives are abstracted behind a RAID stripe (e.g. RAID 0, 5, 6)? Is it expected that a CSD will never live behind such an abstraction?

A. The CSD can operate as a standard drive under RAID, as well as a drive with a complementary CSP (computational storage processor; see the CS Architecture Spec 1.0). If it is deployed under a RAID controller, then the RAID hardware or software would need to understand the computational capabilities of the CSD in order to take full advantage of them.

Q. Are any of the major OEM storage vendors (NetApp / Dell EMC / HPE / IBM, etc.) currently offering Computational Storage capable arrays?

A. A number of OEMs are offering arrays with compute resources that reside with the data. The computational storage initiative promoted by SNIA provides a common reference architecture and programming model that may be referenced by developers and end customers.
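To make the “high level vs. low level” answer above concrete, here is a minimal sketch using OpenCL through the PyOpenCL bindings, one of the GPU frameworks named in the webcast. It assumes PyOpenCL and a working OpenCL platform are installed; the kernel and buffer names are purely illustrative. The host-side calls are the high-level framework API, while the kernel source is the low-level code that actually runs on the accelerator.

```python
# Minimal PyOpenCL sketch: high-level host calls wrapping a low-level device kernel.
# Assumes `pip install pyopencl numpy` and an installed OpenCL driver/platform.
import numpy as np
import pyopencl as cl

a = np.random.rand(50000).astype(np.float32)
b = np.random.rand(50000).astype(np.float32)

ctx = cl.create_some_context()      # framework discovers and selects an accelerator
queue = cl.CommandQueue(ctx)        # command queue to that device

mf = cl.mem_flags
a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)
out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, a.nbytes)

# The kernel source is the low-level piece that runs on the device.
program = cl.Program(ctx, """
__kernel void vec_add(__global const float *a,
                      __global const float *b,
                      __global float *out) {
    int gid = get_global_id(0);
    out[gid] = a[gid] + b[gid];
}
""").build()

program.vec_add(queue, a.shape, None, a_buf, b_buf, out_buf)
result = np.empty_like(a)
cl.enqueue_copy(queue, result, out_buf)   # copy results back to host memory
assert np.allclose(result, a + b)
```

The same pattern (discover a device, stage buffers, launch a kernel, copy results back) appears in CUDA, SYCL and oneAPI with different spellings, which is why a common hardware abstraction layer can sit underneath several of these frameworks.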

Memory Semantics and Data Movement with CXL and SDXI

David McIntyre

Nov 11, 2022

Using software to perform memory copies has been the gold standard for applications performing memory-to-memory data movement or system memory operations. With new accelerators and memory types enriching the system architecture, accelerator-assisted memory data movement and transformation need standardization.

At the forefront of this standardization movement is the SNIA Smart Data Accelerator Interface (SDXI), which is designed as an industry-open standard that is Extensible, Forward-compatible, and Independent of I/O interconnect technology.
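As a purely illustrative sketch (this is not the SDXI descriptor format or API), the idea behind standardized, accelerator-assisted data movement is that an application describes a copy with a small descriptor and submits it to a data mover, rather than looping over CPU memory copies itself. The structure and function names below are hypothetical, and the "data mover" here is just a software copy so the example runs anywhere.

```python
# Hypothetical descriptor-based copy, illustrating the concept SDXI standardizes.
# This is NOT the SDXI wire format; real hardware would consume the descriptor.
from dataclasses import dataclass

@dataclass
class CopyDescriptor:        # illustrative structure: source, destination, length
    src: bytearray
    dst: bytearray
    length: int

def submit(desc: CopyDescriptor) -> None:
    """Software fallback for the copy; a real data mover would do this in hardware."""
    desc.dst[:desc.length] = desc.src[:desc.length]

src = bytearray(b"hello, data mover")
dst = bytearray(len(src))
submit(CopyDescriptor(src=src, dst=dst, length=len(src)))
assert dst == src
```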

Adjacently, Compute Express Link™ (CXL™) is an industry-supported Cache-Coherent Interconnect for Processors, Memory Expansion, and Accelerators. CXL is designed to be an industry-open standard interface for high-speed communications, as accelerators are increasingly used to complement CPUs in support of emerging applications such as Artificial Intelligence and Machine Learning.

How are these two standards related? What are the unique advantages of each? Find out on November 30, 2022 in our next SNIA Networking Storage Forum webcast “What’s in a Name? Memory Semantics and Data Movement with CXL and SDXI” where SNIA and CXL experts working to develop these standards will:

  • Introduce SDXI and CXL
  • Discuss data movement needs in a CXL ecosystem
  • Cover SDXI advantages in a CXL interconnect

Please join us on November 30th to learn more about these exciting technologies.

You’ve Been Framed! An Overview of Programming Frameworks

Alex McDonald

Oct 13, 2022

With the emergence of GPUs, xPUs (DPU, IPU, FAC, NAPU, etc.) and computational storage devices for host offload and accelerated processing, a panoramic wild west of frameworks is emerging, all vying to be one of the preferred programming software stacks that best integrates the application layer with these underlying processing units. On October 26, 2022, the SNIA Networking Storage Forum will break down what’s happening in the world of frameworks in our live webcast, “You’ve Been Framed! An Overview of xPU, GPU & Computational Storage Programming Frameworks.” We’ve convened an impressive group of experts who will provide an overview of programming frameworks that support:
  1. GPUs (CUDA, SYCL, OpenCL, oneAPI)
  2. xPUs (DASH, DOCA, OPI, IPDK)
  3. Computational Storage (SNIA computational storage API, NVMe TP4091 and FPGA programming shells)
We will discuss strengths, challenges and market adoption across these programming frameworks:
  • AI/ML: OpenCL, CUDA, SYCL, oneAPI
  • xPU: DOCA, OPI, DASH, IPDK
  • Core data path frameworks: SPDK, DPDK
  • Computational Storage: SNIA Standard 0.8 (in public review), TP4091
Register today and join us as we untangle the alphabet soup of these programming frameworks.

New Standard Brings Certainty to the Process of Proper Eradication of Data

Eric Hibbard

Oct 3, 2022

A wide variety of data types are recorded on a range of data storage technologies, and businesses need to ensure data residing on data storage devices and media are disposed of in a way that ensures compliance through verification of data eradication.

When media are repurposed or retired from use, the stored data often must be eliminated (sanitized) to avoid potential data breaches. Depending on the storage technology, specific methods must be employed to ensure that the data is eradicated on the logical/virtual storage and media-aligned storage in a verifiable manner.

Existing published standards such as NIST SP 800-88 Revision 1 (Media Sanitization) and ISO/IEC 27040:2015 (Information technology – Security techniques – Storage security) provide guidance on sanitization covering storage technologies from the last decade, but they have not kept pace with current technology or legislative requirements.

New standard makes conformance clearer

Recently published (August 2022), IEEE 2883-2022, IEEE Standard for Sanitizing Storage, addresses contemporary technologies and provides requirements that can be used for conformance purposes.

The new international standard, as with ISO/IEC 27040, defines sanitization as the ability to render access to target data on storage media infeasible for a given level of effort. IEEE 2883 is anticipated to be the go-to standard for media sanitization of modern and legacy technologies.

The IEEE 2883 standard specifies three methods for sanitizing storage: Clear, Purge, and Destruct. In addition, the standard provides technology-specific requirements and guidance for eradicating data associated with each sanitization method.

It establishes:

  • A baseline standard on how to sanitize data by media type according to accepted industry categories of Clear, Purge, and Destruct
  • Specific guidance so that organizations can trust they have achieved sanitization and can make confident conformance claims
  • Clarification around the various methods by media and type of sanitization
  • A structure designed to be referenceable by other standards documents, such as NIST or ISO standards, so that they can also reflect the most up-to-date sanitization methods

With this conformance clarity, particularly if widely adopted, organizations will be able to make more precise decisions around how they treat their end-of-life IT assets.

In addition, IEEE recently approved a new project, IEEE P2883.1 (Recommended Practice for Use of Storage Sanitization Methods), to build on IEEE 2883-2022. Anticipated topics include guidance on selecting appropriate sanitization methods and verification approaches.

If you represent a data-driven organization, a data security audit or certification organization, or a manufacturer of data storage technologies, you should begin preparing for these changes now.

More Information

For more information visit the IEEE 2883 – Standard for Sanitizing Storage project page. The current IEEE Standard for Sanitizing Storage is also available for purchase.

There is an IEEE webinar on Storage Sanitization – Eradicating Data in an Eco-friendly Way scheduled for October 26th. Register now.

The SNIA Storage Security Summit held in May this year covered the topic of media sanitization and the new standard, and you can now view the recorded presentation.

Eric A. Hibbard, CISSP-ISSAP, ISSMP, ISSEP, CIPT, CISA, CCSK 

Chair, SNIA Security Technical Work Group; Chair, INCITS TC CS1 Cyber Security; Chair, IEEE Cybersecurity & Privacy Standards Committee (CPSC); Co-Chair, Cloud Security Alliance (CSA) – International Standardization Council (ISC); Co-Chair, American Bar Association – SciTech Law – Internet of Things (IoT) Committee

Reaching a Computational Storage Milestone

SNIAOnStorage

Sep 1, 2022

Version 1.0 of the SNIA Computational Storage Architecture and Programming Model has just been released to the public at www.snia.org/csarch. The Model has received industry accolades, winning the Flash Memory Summit 2022 Best of Show Award for Most Innovative Memory Technology at their recent conference. Congratulations to all the companies and individuals who contributed large amounts of expertise and time to the creation of this Model. 

SNIAOnStorage sat down with SNIA Computational Storage Technical Work Group (CS TWG) co-chairs Jason Molgaard and Scott Shadley; SNIA Computational Storage Architecture and Programming Model editor Bill Martin; and SNIA Computational Storage Special Interest Group chair David McIntyre to get their perspectives on this milestone release and next steps for SNIA.

SNIAOnStorage (SOS): What is the significance of a 1.0 release for this computational storage SNIA specification?

Bill Martin (BM): The 1.0 designation indicates that the SNIA membership has voted to approve the SNIA Computational Storage Architecture and Programming Model as an official SNIA specification. This means that our membership believes the architecture is something you can develop computational storage-related products to, where multiple vendors’ products will have similar, complementary architectures and an industry-standardized programming model.

Jason Molgaard (JM): The 1.0 release also indicates a level of maturity where companies can implement computational storage that reflects the elements of the Model.  The SNIA CS TWG took products into account when defining the Model’s reference architecture.  The Model is for everyone – even those who were not part of the 52 participating companies and 258 member representatives in the TWG – this is concrete, and they can begin development today.

SNIA Computational Storage Technical Work Group Company Members

SOS: What do you think is the most important feature of the 1.0 release?

Scott Shadley (SS): Because we have reached the 1.0 release, there is no one specific area that makes one feature more important than anything else. The primary difference between the last release and 1.0 was the addition of the Security section. As we know, there are many new security discussions happening and we want to ensure our architecture doesn’t break or even create new security needs. Overall, all aspects are key and relevant.

JM: I agree. The entire Model is applicable to product development and is a comprehensive and inclusive specification. I cannot point to a single section that is subordinate to other sections in the Model.

David McIntyre (DM): It’s an interesting time for these three domains – compute, storage, and networking – which are beginning to merge and support each other. The 1.0 Model has a nice baseline on definitions – before this there were none, but now we have Computational Storage Devices (CSxes), which include Computational Storage Processors (CSPs), Computational Storage Drives (CSDs), and Computational Storage Arrays (CSAs), and more; and companies can better define what a CSP is and how it connects to associated storage. Definitions help to educate and ground the ecosystem and the engineering community, and to characterize vendor solutions into these categories.

BM:  I would say that the four most important parts of the 1.0 Model are:  1) it defines terminology that can be used across different protocols; 2) it defines a discovery process flow for those architectures; 3) it defines security considerations for those architectures; and 4) it gives users some examples that can be used for those architectures.

SOS:  Who do you see as the audience/user for the Model?  What should these constituencies do with the Model? 

JM: The Model is useful both for hardware developers who are developing their own computational storage systems and for software architects, programmers, and other users who want to be educated on the definitions and the common framework that the architecture describes for computational storage. This will enable everyone to be on the same playing field. The intent is for everyone to have the same level of understanding and to carry on conversations with internal and external developers who are working on related projects. Now they can speak on the same plane. Our wish is for folks to adhere to the Model and follow it in their product development.

DM: Having an industry-developed reference architecture that hardware and application developers can refer to is an important attribute of the 1.0 specification, especially as we get into cloud-to-edge deployments, where standardization has not come as early. Putting compute at the edge, where the data is being generated, gives us the opportunity to provide the normalization and standardization that application developers can refer to when contributing computational storage solutions to the edge ecosystem.

SS: Version 1.0 is designed with customers in mind, to be used as a full reference document. It is an opportunity to highlight that vendors and solution providers are doing this in a directed and unified way. Customers with a multi-sourcing strategy see this as something that resonates well and drives involvement with the technology.

SOS: Are there other activities within SNIA going along with the release of the Model?

BM:  The CS TWG is actively developing a Computational Storage API that will utilize the Model and provide an application programming interface for which vendors can provide a library that maps to their particular protocol, which would include the NVMe® protocol layer.

JM:  The TWG is also collaborating with the SNIA Smart Data Accelerator Interface (SDXI) Technical Work Group on how SDXI and computational storage can potentially be combined in the future.

There is a good opportunity for security to continue to be a focus of discussion in the TWG – examining the threat matrix as the Model evolves to ensure that we are not recreating or disbanding what is out there – and that we use existing solutions.

DM:  From a security standpoint the Model and the API go hand in hand as critical components far beyond the device level.  It is very important to evolve where we are today from device to solution level capabilities.  Having this group of specifications is very important to contribute to the overall ecosystem.

SOS:  Are there any industry activities going along with the release of version 1.0 of the Model?

BM: NVM Express® is continuing its development effort on computational programs and Subsystem Local Memory, which will provide a mechanism to implement the SNIA Architecture and Programming Model.

JM: Compute Express Link™ (CXL™) is a logical progression for computational storage from an interface perspective.  As time moves forward, we look for much work to be done in that area.

SS: We know from Flash Memory Summit 2022 that CXL is a next generation transport planned for both storage and memory devices. CXL focuses on memory today and the high-speed transport expected there. CXL is basically the transport beyond NVMe. One key feature of the SNIA Architecture and Programming Model is to ensure it can apply to CXL, Ethernet, or other transports, as it does not dictate the transport layer that is used to talk to Computational Storage Devices (CSxes).

DM:  Standards bodies have been siloed in the past. New opportunities of interfaces and protocols that work together harmoniously will better enable alliances to form.  Grouping of standards that work together will better support application requirements from cloud to edge.

SOS:  Any final thoughts?

BM: You may ask “Will there be a next generation of the Model?” Yes, we are currently working on the next generation with security enhancements and any other comments we get from public utilization of the Model. Comments can be sent to the SNIA Feedback Portal.

DM: We also welcome input from other industry organizations and their implementations.

BM: For example, if there are implications to the Model from work done by CXL, they could give input and the TWG would work with CXL to integrate necessary enhancements.

JM: CXL could develop new formats specific to Computational Storage.  Any new commands could still align with the model since the model is transport agnostic. 

SOS: Thanks for your time in discussing the Model.  Congratulations on the 1.0 release! And for our readers, check out these links for more information on computational storage:

Computational Storage Playlist on the SNIA Video Channel

Computational Storage in the SNIA Educational Library

SNIA Technology Focus Area – Computational Storage
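For readers who want a feel for what the Model’s discover-and-execute flow implies for application code, here is a deliberately simplified, hypothetical sketch. None of the class or function names come from the SNIA Computational Storage API or NVMe TP4091; the in-memory "drive" below only stands in for a CSD so the pattern (discover a device, load a program, execute it near the data, retrieve a small result) can run as-is.

```python
# Hypothetical sketch of the computational storage offload pattern described in
# the Model: compute runs where the data lives and only results return to the host.
class FakeComputationalStorageDrive:
    def __init__(self, blocks):
        self.blocks = blocks                 # pretend on-drive data, keyed by LBA
        self.programs = {}

    def load_program(self, name, fn):
        self.programs[name] = fn             # e.g., a filter, scan, or checksum

    def execute(self, name, lba_range):
        data = b"".join(self.blocks[lba] for lba in lba_range)
        return self.programs[name](data)     # the compute runs "on the drive"

def discover_devices():
    # A real framework would enumerate CSxes via the transport (NVMe, CXL, etc.)
    return [FakeComputationalStorageDrive({0: b"error ok ", 1: b"error fail"})]

csd = discover_devices()[0]
csd.load_program("count_errors", lambda buf: buf.count(b"error"))
print(csd.execute("count_errors", [0, 1]))   # host receives the result, not raw data
```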

Is the Data Really Gone? A Q&A

SNIA CMSI

Aug 25, 2022

In our recent webcast, Is the Data Really Gone? A Primer on the Sanitization of Storage Devices, our presenters Jonmichael Hands (Chia Network), Jim Hatfield (Seagate), and John Geldman (KIOXIA) took an in-depth look at exactly what sanitization is, what the standards are, and where sanitization is being practiced today. If you missed it, you can watch it on-demand to hear their recommendations for the verification of sanitization to ensure that devices meet stringent requirements, and access the presentation slides in the SNIA Educational Library. Here, in our Q&A blog, our experts answer more of your questions on data sanitization.

Is Over Provisioning part of the spare blocks or separate?

The main intent of an overprovisioning strategy is to resolve the asymmetric NAND behaviors of Block Erase (e.g., MBs) and Page Write (e.g., KBs), which allows efficient use of a NAND die’s endurance capability. In other words, it is a store-over capability that is regularly used, leaving older versions of a logical block address (LBA) in media until it is appropriate to garbage collect.

Spares are a subset of overprovisioning and a spare block strategy is different than an overprovisioning strategy. The main intent of a spare strategy is a failover capability mainly used on some kind of failure (this can be a temporary vibration issue on a hard disk drive or a bad sector).

The National Institute of Standards and Technology (NIST) mentions the NVMe® Format command with Secure Erase Settings set to 1 (User Data Erase) or 2 (Crypto Erase) as a purge method. From what I can gather, the sanitize capability was more a fallout of the format than anything that was designed. With NVMe Sanitize, would you expect the Format with the Data Erasure options to be deprecated or moved back to a clear?

The Format NVM command does have a crypto erase, but it is entirely unspecified, vendor specific, and without any requirements. It is not to be trusted. Sanitize, however, can be trusted, has specific TESTABLE requirements, and is sanctioned by IEEE 2883.

The Format NVM command was silent on some requirements that are explicit in both the NVMe Sanitize command and IEEE 2883. It was possible, but not required, for an NVMe Format with Secure Erase Settings set to Crypto to also purge other internal buffers. Such behavior beyond the specification is vendor specific. Without assurance from the vendor, be wary of assuming the vendor made additional design efforts. The NVMe Sanitize command does meet the requirements of purge as defined in IEEE 2883.
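As an illustration of the distinction above, here is a hedged sketch of invoking the NVMe Sanitize command (rather than Format NVM) from a Linux host using the nvme-cli utility, wrapped in Python. It assumes nvme-cli is installed, root privileges, and that /dev/nvme0 is an expendable test device; option spellings vary between nvme-cli versions, so verify with `nvme sanitize --help` before relying on this.

```python
# Sketch: issue an NVMe Sanitize (Block Erase) and dump the Sanitize Status log.
# Assumes nvme-cli is installed and /dev/nvme0 is a device you are willing to erase.
import subprocess

DEV = "/dev/nvme0"   # the intended (and expendable!) controller

# Sanitize Action 2 = Block Erase; 4 = Crypto Erase (per the NVMe specification).
subprocess.run(["nvme", "sanitize", DEV, "--sanact=2"], check=True)

# The Sanitize Status log page (SPROG / SSTAT fields) reports progress and the
# final result. Exact output formatting differs between nvme-cli versions.
print(subprocess.run(["nvme", "sanitize-log", DEV],
                     capture_output=True, text=True, check=True).stdout)
```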

My question is around logical storage (file-level, OS/filesystem, logical volumes, not able to map to physical DDMs): what can be done at the technical level, and to what degree is it beyond what modern arrays can do (e.g., too many logical layers) and thus falls under procedural controls? Can you comment on regulatory alignment with technical (or procedural) acceptable practices?

The IEEE Security in Storage Working Group (SISWG) has not had participation by subject matter experts in this area, and therefore has not made any requirements, recommendations, or statements of acceptable practice. Should such experts participate, we can consider requirements, recommendations, and acceptable practices.

Full verification is very expensive especially if you are doing lots of drives simultaneously. Why can't you seed like you could do for crypto, verify the seeding is gone, and then do representative sampling?

The problem with seeding before crypto erase is that you don’t know the before and after data to actually compare with. Reading after crypto erase returns garbage, but you don’t know if it is the right garbage. In addition, in some implementations, doing a crypto erase also destroys the CRC/EDC/ECC information, making the data unreadable after crypto erase.

Seeding is not a commonly defined term. If what was intended by seeding was writing known values into known locations, be aware that there are multiple problems with that process. Consider an Overwrite Sanitize operation. Such an operation writes the same pattern into every accessible and non-accessible block. That means that the device is completely written with no free media (even the overprovisioning has that pattern). For SSDs, a new write into that device has to erase data before it can be re-written. This lack of overprovisioned data in SSDs results in artificially accelerated endurance issues.

A common solution implemented by multiple companies is to de-allocate after sanitization. After a de-allocation, a logical block address will not access physical media until that logical block address is written by the host. This means that even if known data was written before sanitize, and if the sanitize did not do its job, then the read-back will not return the data from the physical media that used to be allocated to that address (i.e., that physical block is de-allocated) so the intended test will not be effective.

Are there other problems with Sanitize?

Another problem with Sanitize is that internal protection information (e.g., CRC data, Integrity Check data, and Error Correction Code data) have also been neutralized until that block is written again with new data. Most SSDs are designed to never return bad data (e.g., data that fails Integrity Checks) as a protection and reliability feature.

What are some solutions for Data Sanitization?

One solution that has been designed into NVMe is for the vendor to support a full overwrite of media after a crypto erase or a block erase sanitize operation. Note that such an overwrite has unpopular side-effects as the overwrite:

  1. changes any result of the actual sanitize operation;
  2. may take a significant time (e.g., multiple days); and
  3. still requires a full-deallocation by the host to make the device useful again.

A unique complication is that a Block Erase sanitize operation leaves NAND in an erased state, which is not stable at the NAND layer, so a full write of deallocated media can be scheduled to be done over time, or the device can be designed to complete an overwrite before the sanitize operation returns a completion. In either case, the media remains deallocated until the blocks are written by the host.

Can you kindly clarify “deallocate all storage before leaving sanitize”? What does that mean physically?

Deallocation (by itself) is not acceptable for sanitization. It is allowable AFTER a proper and thorough sanitization has taken place. Also, in some implementations, reading a deallocated logical block results in a read error. Deallocation must be USED WITH CAUTION. There are many knobs and switches to set to do it right.

Deallocation means removing the internal addressing that mapped a logical block to a physical block. After deallocation, media is not accessed so the read of a logical block address provides no help in determining if the media was actually sanitized or not. Deallocation gives as factory-fresh out of the box performance as is possible.

Kubernetes is Everywhere Q&A

Alex McDonald

Aug 3, 2022

Earlier this month, the SNIA Cloud Storage Technologies Initiative hosted a fascinating panel discussion, “Kubernetes is Everywhere: What About Cloud Native Storage?” where storage experts from SNIA and Kubernetes experts from the Cloud Native Computing Foundation (CNCF) discussed storage implications for Kubernetes. It was a lively and enlightening discussion on key considerations for container storage. In this Q&A blog, our panelists Nick Connolly, Michael St-Jean, Pete Brey and I elaborate on some of the most intriguing questions during the session.

Q. What are the additional/different challenges for Kubernetes storage at the edge, in contrast to the data center?

A. Edge means different things depending on context. It could mean enterprise or provider edge locations, which are typically characterized by smaller, compact deployments of Kubernetes. It could mean Kubernetes deployed on a single node at a site with little or no IT support, or even disconnected from the internet, on ships, oil rigs, or even in space, for example. It can also mean device edge, like MicroShift running on a small form factor computer or within an ARM or FPGA card, for example.

One big challenge for Kubernetes at the edge in general is to provide a lightweight deployment. Added components, like container-native storage, are required for many edge applications, but they take up resources. Therefore, the biggest challenge is to deploy the storage resources that are necessary for the workload while making sure your footprint is appropriate for the deployment infrastructure. For example, there are deployments of container storage for compact edge clusters, and there is work taking place on single-node deployments. Another emerging approach is to use data mirroring, data caching, and data federation technologies to provide access between edge devices and enterprise edge deployments, or deployments in the cloud or data center.

Q. What does container-native storage mean, and how does it differ from a SAN?

A. Container-native storage includes Kubernetes services that allow for dynamic and static provisioning, Day 1 and Day 2 operations and management, and additional data services like security, governance, resiliency and data discovery that must be deployed in the context of the Kubernetes cluster. A SAN could be connected to a cluster via a Container Storage Interface (CSI) driver, but it would typically not have all the capabilities provided by a container-native storage solution. Some container-native storage solutions, however, can use an underlying SAN or NAS device to provide the core storage infrastructure while delivering the Kubernetes-aware services required by the cluster. In this way, organizations can make use of existing infrastructure, protecting their investment, while still getting the Kubernetes services that are required by applications and workloads running in the cluster.

Q. You mention that Kubernetes does a good job of monitoring applications and keeping them up and running, but how does it prevent split-brain action on the storage when that happens?

A. This will be a function provided by the container-native storage provider. The storage service will include some type of arbiter for data in order to prevent split-brain. For example, a monitor within the software-defined storage subsystem may maintain a cluster map and the state of the environment in order to provide distributed decision-making.
Monitors would typically be configured in an odd number, 3 or 5, depending on the size and the topology of the cluster, to prevent split-brain situations. Monitors are not in the data path and do not serve IO requests to and from the clients.

Q. So do I need to go and buy a whole new infrastructure for this, or can I use my existing SAN?

A. Some container-native storage solutions can use existing storage infrastructure, so typically you are able to protect your investment in existing capital infrastructure purchases while gaining the benefits of the Kubernetes data services required by the cluster and applications.

Q. How can I keep my data secure in a multi-tenanted environment?

A. There are concerns about data security that are answered by the container-native storage solution; however, integration of these services should be considered alongside other security tools delivered for Kubernetes environments. For example, you should consider the container-native storage solution’s ability to provide encryption for data at rest, as well as data in motion. Cluster-wide encryption should be a default requirement; however, you may also want to encrypt data from one tenant (application) to another. This would require volume-level encryption, and you would want to make sure your provider has an algorithm that creates different keys on clones and snapshots. You should also consider where your encryption keys are stored. Using a storage solution that is integrated with an external key management system protects against hacks within the cluster. For additional data security, it is useful to review the solution architecture, what the underlying operating system kernel protects, and how its cryptography API is utilized by the storage software. Full integration with your Kubernetes distribution’s authentication process is also important.

In recent years, ransomware attacks have also become prevalent. While some systems attempt to protect against ransomware attacks, the best advice is to make sure you have proper encryption on your data, and that you have a substantial data protection and disaster recovery strategy in place. Data protection in a Kubernetes environment is slightly more complex than in a typical data center because the state of an application running in Kubernetes is held by the persistent storage claim. When you back up your data, you must have cluster-aware APIs in your data protection solution that are able to capture the context of the cluster and the application with which the data is associated. Some of those APIs may be available as part of your container-native storage deployment and integrated with your existing data center backup and recovery solution. Additional business continuity strategies, like metropolitan and regional disaster recovery clusters, can also be attained. Integration with multi-cluster control plane solutions that work with your chosen Kubernetes distribution can help facilitate a broad business continuity strategy.

Q. What’s the difference between data access modes and data protocols?

A. You create a persistent volume (or PV) based on the type of storage you have. That storage will typically support one or more data protocols. For example, you might have storage set up as a NAS supporting NFS and SMB protocols.
So, you have file protocols, and you might have a SAN set up to support your databases, which run a block protocol, or you might have a distributed storage system with a data lake or archive that is running object protocols, or it could be running all three protocols in separate storage pools. In Kubernetes, you’ll have access to these PVs, and when a user needs storage, they will ask for a persistent volume claim (or PVC) for their project. Alternatively, some systems support an object bucket claim as well. In any case, when you make that claim request, you do so based on storage classes with different access modes: RWO (read-write once, where the volume can be mounted as read-write by a single node), RWX (read-write many, where the volume can be mounted as read-write by many nodes), and ROX (read-only many, where the volume can be mounted as read-only by many nodes). Different types of storage APIs are able to support those different access modes. For example, a block protocol, like EBS or Cinder, would support RWO. A filesystem like Azure File or Manila would support RWX. NFS would support all three access modes.

Q. What are object bucket claims and namespace buckets?

A. Object bucket claims are analogous to the PVCs mentioned above, except that they are the method for provisioning and accessing object storage within Kubernetes projects using a storage class. Because the interface for object storage is different than for block or file storage, there is a separate Kubernetes standard called COSI. Typically, a user wanting to mount an object storage pool would connect through an S3 RESTful protocol. Namespace buckets are used more for data federation across environments. So you could have a namespace bucket deployed with the backend data on AWS, for example, and it can be accessed and read by clients running in Kubernetes clusters elsewhere, like on Azure, in the data center or at the edge.

Q. Why is backup and recovery listed as a feature of container-native storage? Can’t I just use my data center data protection solution?

A. As we mentioned, containers are by nature ephemeral. So if you lose your application, or the cluster, the state of that application is lost. The state of your application in Kubernetes is held by the persistent storage associated with that app. So, when you back up your data, it needs to be in the context of the application and the overall cluster resources, so that when you restore, there are APIs to recover the state of the pod. Some enterprise data protection solutions include cluster-aware APIs, and they can be used to extend your data center data protection to your Kubernetes environment, notably IBM Spectrum Protect Plus, Dell PowerProtect, Veritas, etc. There are also Kubernetes-specific data protection solutions like Kasten by Veeam, Trilio, and Bacula. You may be able to use your existing enterprise solution; just be sure to check whether it supports cluster-aware Kubernetes APIs.

Q. Likewise, what is different about planning disaster recovery for Kubernetes?

A. Similar to the backup/recovery discussion, since the state of the applications is held by the persistent storage layer, the failure and recovery need to include cluster-aware APIs, but beyond that, if you are trying to recover to another cluster, you’ll need a control plane that manages resources across clusters. Disaster recovery really becomes a question about your recovery point objectives and your recovery time objectives.
It could be as simple as backing up everything to tape every night and shipping those tapes to another region. Of course, your recovery point might be a full day, and your recovery time will vary depending on whether you have a live cluster to recover to, etc. You could also have a stretch cluster, which is a cluster with individual nodes that are physically separated across failure domains. Typically, you need to be hyper-conscious of your network capabilities, because if you are going to stretch your cluster across a campus or city, for example, you could degrade performance considerably without the proper network bandwidth and latency. Other options such as synchronous metro DR or asynchronous regional DR can be adopted, but your ability to recover, or your recovery time objective, will depend a great deal on the degree of automation you can build in for the recovery. Just be aware, and do your homework, as to what control plane tools are available and how they integrate with the storage system you’ve chosen, and ensure that they align to your recovery time objectives.

Q. What’s the difference between cluster-level encryption and volume-level encryption in this context?

A. For security, you’ll want to make sure that your storage solution supports encryption. Cluster-wide encryption is at the device level and protects against external breaches. As an advanced feature, some solutions provide volume-level encryption as well. This protects individual applications or tenants from others within the cluster. Encryption keys are created and can be stored within the cluster, but then those with cluster access could hack those keys, so support for integration with an external key management system is preferable to enhance security.

Q. What about some of these governance requirements like SEC, FINRA, and GDPR? How does container-native storage help?

A. This is really a question about the security factors of your storage system. GDPR has a lot of governance requirements, and ensuring that you have proper security and encryption in place in case data is lost is a key priority. FINRA is more of a US financial brokerage regulation working with the Securities and Exchange Commission. Things like data immutability may be an important feature for financial organizations. Other agencies, like the US government, have encryption requirements like FIPS, which certifies cryptography APIs within an operating system kernel. Some storage solutions that make use of those crypto APIs would be better suited for particular use cases. So, it’s not really a question of your storage being certified by any of these regulatory committees, but rather of ensuring that your persistent storage layer integrated with Kubernetes does not break compliance of the overall solution.

Q. How is data federation used in Kubernetes?

A. Since Kubernetes offers an orchestration and management platform that can be delivered across many different infrastructures, whether on-prem, on a public or private cloud, etc., being able to access and read data from a single source across Kubernetes clusters on top of differing infrastructures provides a huge advantage for multi- and hybrid-cloud deployments. There are also tools that allow you to federate SQL queries across different storage platforms, whether they are in Kubernetes or not. Extending your reach to data existing off-cluster helps build data insights through analytics engines and provides data discovery for machine learning model management.

Q. What tools differentiate data acquisition and preparation in Kubernetes?
A. Ingesting data from edge devices or IoT into Kubernetes allows data engineers to create automated data pipelines. Using some tools within Kubernetes, like Knative, allows engineers to create triggered events that spawn applications within the system and further automate workflows. Additional tools, like bucket notifications and Kafka streams, can help with the movement, manipulation, and enhancement of data within the workstream. A lot of organizations are using distributed application workflows to build differentiated use cases on Kubernetes.
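To ground the persistent volume claim and access-mode discussion above, here is a minimal sketch using the official Kubernetes Python client. It assumes a reachable cluster and a valid kubeconfig; the storage class name "container-native-storage" is a placeholder for whatever your container-native storage solution registers.

```python
# Minimal PVC request via the Kubernetes Python client (pip install kubernetes).
# The storage class name below is a placeholder; RWO matches the block-style case
# described in the access-modes answer above.
from kubernetes import client, config

config.load_kube_config()        # or config.load_incluster_config() inside a pod
core = client.CoreV1Api()

pvc = client.V1PersistentVolumeClaim(
    metadata=client.V1ObjectMeta(name="demo-claim"),
    spec=client.V1PersistentVolumeClaimSpec(
        access_modes=["ReadWriteOnce"],   # RWX / ROX are possible where the storage supports them
        storage_class_name="container-native-storage",
        resources=client.V1ResourceRequirements(requests={"storage": "10Gi"}),
    ),
)

core.create_namespaced_persistent_volume_claim(namespace="default", body=pvc)
print("PVC created; the storage class's provisioner will bind a PV to it.")
```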
