Kubernetes is Everywhere Q&A

Alex McDonald

Aug 3, 2022

Earlier this month, the SNIA Cloud Storage Technologies Initiative hosted a fascinating panel discussion, “Kubernetes is Everywhere: What About Cloud Native Storage?” where storage experts from SNIA and Kubernetes experts from the Cloud Native Computing Foundation (CNCF) discussed storage implications for Kubernetes. It was a lively and enlightening discussion on key considerations for container storage. In this Q&A blog, our panelists Nick Connolly, Michael St-Jean, Pete Brey and I elaborate on some of the most intriguing questions from the session.

Q. What are the additional/different challenges for Kubernetes storage at the edge, in contrast to the data center?

A. Edge means different things depending on context. It could mean enterprise or provider edge locations, which are typically characterized by smaller, compact deployments of Kubernetes. It could mean Kubernetes deployed on a single node at a site with little or no IT support, or even disconnected from the internet, on ships, oil rigs, or even in space, for example. It can also mean device edge, like MicroShift running on a small form factor computer or within an ARM or FPGA card. One big challenge for Kubernetes at the edge in general is to provide a lightweight deployment. Added components, like container-native storage, are required for many edge applications, but they consume resources. The biggest challenge, therefore, is to deploy the storage resources the workload needs while keeping the footprint appropriate for the deployment infrastructure. For example, there are container storage deployments for compact edge clusters, and work is underway on single-node deployments. Another emerging approach is to use data mirroring, data caching, and data federation technologies to provide access between edge devices and enterprise edge deployments, or deployments in the cloud or data center.

Q. What does container-native storage mean, and how does that differ from a SAN?

A. Container-native storage includes Kubernetes services that allow for dynamic and static provisioning, Day 1 and Day 2 operations and management, and additional data services like security, governance, resiliency and data discovery that must be deployed in the context of the Kubernetes cluster. A SAN could be connected to a cluster via a Container Storage Interface (CSI) driver; however, it would typically not have all the capabilities provided by a container-native storage solution. Some container-native storage solutions, however, can use an underlying SAN or NAS device to provide the core storage infrastructure while delivering the Kubernetes-aware services required by the cluster. In this way, organizations can make use of existing infrastructure, protecting their investment, while still getting the Kubernetes services required by applications and workloads running in the cluster.

Q. You mention that Kubernetes does a good job of monitoring applications and keeping them up and running, but how does it prevent split-brain action on the storage when that happens?

A. This is a function provided by the container-native storage provider. The storage service will include some type of arbiter for data in order to prevent split-brain. For example, a monitor within the software-defined storage subsystem may maintain a cluster map and the state of the environment in order to provide distributed decision-making.
Monitors are typically configured in an odd number, 3 or 5, depending on the size and topology of the cluster, to prevent split-brain situations. Monitors are not in the data path and do not serve I/O requests to and from the clients.

Q. So do I need to go and buy a whole new infrastructure for this, or can I use my existing SAN?

A. Some container-native storage solutions can use existing storage infrastructure, so typically you are able to protect your investment in existing capital infrastructure purchases while gaining the benefits of the Kubernetes data services required by the cluster and applications.

Q. How can I keep my data secure in a multi-tenanted environment?

A. There are concerns about data security that are answered by the container-native storage solution; however, integration of these services should be considered alongside other security tools delivered for Kubernetes environments. For example, you should consider the container-native storage solution's ability to provide encryption for data at rest as well as data in motion. Cluster-wide encryption should be a default requirement, but you may also want to encrypt data from one tenant (application) to another. This requires volume-level encryption, and you will want to make sure your provider has an algorithm that creates different keys on clones and snapshots. You should also consider where your encryption keys are stored. Using a storage solution that is integrated with an external key management system protects against hacks within the cluster (see the storage class sketch after this answer). For additional data security, it is useful to review the solution architecture, what the underlying operating system kernel protects, and how its cryptography API is utilized by the storage software. Full integration with your Kubernetes distribution's authentication process is also important.

In recent years, ransomware attacks have also become prevalent. While some systems attempt to protect against ransomware, the best advice is to make sure you have proper encryption on your data and a substantial data protection and disaster recovery strategy in place. Data protection in a Kubernetes environment is slightly more complex than in a typical data center because the state of an application running in Kubernetes is held by the persistent storage claim. When you back up your data, your data protection solution must have cluster-aware APIs that can capture context with the cluster and the application with which it is associated. Some of those APIs may be available as part of your container-native storage deployment and integrated with your existing data center backup and recovery solution. Additional business continuity strategies, like metropolitan and regional disaster recovery clusters, can also be attained. Integration with multi-cluster control plane solutions that work with your chosen Kubernetes distribution can help facilitate a broad business continuity strategy.
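To make the volume-level encryption point concrete, here is a minimal sketch that creates a Kubernetes StorageClass whose parameters ask the CSI provisioner to encrypt each volume and to look up keys in an external key management system. It uses the official Kubernetes Python client; the provisioner name and the `encrypted`/`encryptionKMSID` parameters are illustrative assumptions (they follow the convention of some Ceph-based CSI drivers) and will differ by storage vendor, so treat this as a hedged sketch rather than a drop-in configuration.

```python
from kubernetes import client, config

# Load credentials from the local kubeconfig; inside a pod you would use
# config.load_incluster_config() instead.
config.load_kube_config()

# StorageClass that asks the CSI provisioner for per-volume encryption with
# keys held in an external KMS. The provisioner and parameter names are
# vendor-specific examples -- check your own driver's documentation.
encrypted_sc = {
    "apiVersion": "storage.k8s.io/v1",
    "kind": "StorageClass",
    "metadata": {"name": "encrypted-block"},
    "provisioner": "rbd.csi.ceph.com",            # illustrative CSI provisioner
    "parameters": {
        "encrypted": "true",                      # hypothetical: encrypt each volume
        "encryptionKMSID": "vault-tenant-a",      # hypothetical: external KMS reference
    },
    "reclaimPolicy": "Delete",
    "allowVolumeExpansion": True,
}

client.StorageV1Api().create_storage_class(body=encrypted_sc)
print("StorageClass 'encrypted-block' created")
```

A tenant's claims made against this class would then get their own encrypted volumes, while keys stay outside the cluster in the KMS.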
Q. What's the difference between data access modes and data protocols?

A. You create a persistent volume (or PV) based on the type of storage you have. That storage will typically support one or more data protocols. For example, you might have storage set up as a NAS supporting NFS and SMB protocols, so you have file protocols; you might have a SAN set up to support your databases, which runs a block protocol; or you might have a distributed storage system with a data lake or archive that is running object protocols. It could even be running all three protocols in separate storage pools. In Kubernetes, you'll have access to these PVs, and when users need storage, they will ask for a Persistent Volume Claim (or PVC) for their project. Alternatively, some systems support an Object Bucket Claim as well. In any case, when you make that claim request, you do so based on storage classes with different access modes: RWO (read-write once, where the volume can be mounted as read-write by a single node), RWX (read-write many, where the volume can be mounted as read-write by many nodes), and ROX (read-only many, where the volume can be mounted as read-only by many nodes). Different types of storage APIs are able to support those different access modes. For example, a block protocol, like EBS or Cinder, would support RWO. A filesystem like Azure File or Manila would support RWX. NFS would support all three access modes.
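As a short illustration of the claim mechanics described above, the sketch below requests a PVC against a storage class using the official Kubernetes Python client. The storage class name, namespace, and size are assumptions for illustration; the access mode strings are simply the Kubernetes spellings of RWO, RWX, and ROX.

```python
from kubernetes import client, config

config.load_kube_config()

# A claim for 10 GiB of RWO storage from an assumed "fast-block" storage class.
# "ReadWriteOnce" is the Kubernetes spelling of RWO; "ReadWriteMany" (RWX) and
# "ReadOnlyMany" (ROX) are the other access modes discussed above.
pvc = {
    "apiVersion": "v1",
    "kind": "PersistentVolumeClaim",
    "metadata": {"name": "demo-claim"},
    "spec": {
        "accessModes": ["ReadWriteOnce"],
        "storageClassName": "fast-block",          # assumed class name
        "resources": {"requests": {"storage": "10Gi"}},
    },
}

client.CoreV1Api().create_namespaced_persistent_volume_claim(
    namespace="default", body=pvc
)
print("PVC 'demo-claim' created; a pod can now mount it by name")
```

Which access modes actually succeed depends on the storage class's provisioner, exactly as described in the answer above.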
Q. What are object bucket claims and namespace buckets?

A. Object bucket claims are analogous to the PVCs mentioned above, except that they are the method for provisioning and accessing object storage within Kubernetes projects using a storage class. Because the interface for object storage is different than for block or file storage, there is a separate Kubernetes standard called COSI. Typically, a user wanting to mount an object storage pool would connect through an S3 RESTful protocol. Namespace buckets are used more for data federation across environments. So you could have a namespace bucket deployed with the backend data on AWS, for example, and it can be accessed and read by clients running in Kubernetes clusters elsewhere, like on Azure, in the data center, or at the edge.

Q. Why is backup and recovery listed as a feature of container-native storage? Can't I just use my data center data protection solution?

A. As we mentioned, containers are by nature ephemeral. So if you lose your application, or the cluster, the state of that application is lost. The state of your application in Kubernetes is held by the persistent storage associated with that app. So, when you back up your data, it needs to be in the context of the application and the overall cluster resources, so that when you restore, there are APIs to recover the state of the pod. Some enterprise data protection solutions, notably IBM Spectrum Protect Plus, Dell PowerProtect, and Veritas, include cluster-aware APIs and can be used to extend your data center data protection to your Kubernetes environment. There are also Kubernetes-specific data protection solutions like Kasten by Veeam, Trilio, and Bacula. You may be able to use your existing enterprise solution; just be sure to check whether it supports cluster-aware Kubernetes APIs.

Q. Likewise, what is different about planning disaster recovery for Kubernetes?

A. Similar to the backup/recovery discussion, since the state of the applications is held by the persistent storage layer, failure and recovery need to include cluster-aware APIs. Beyond that, if you are trying to recover to another cluster, you'll need a control plane that manages resources across clusters. Disaster recovery really becomes a question about your recovery point objectives and your recovery time objectives. It could be as simple as backing up everything to tape every night and shipping those tapes to another region. Of course, your recovery point might be a full day, and your recovery time will vary depending on whether you have a live cluster to recover to. You could also have a stretch cluster, which is a cluster whose individual nodes are physically separated across failure domains. Typically, you need to be hyper-conscious of your network capabilities, because if you stretch your cluster across a campus or city, for example, you could degrade performance considerably without the proper network bandwidth and latency. Other options such as synchronous metro DR or asynchronous regional DR can be adopted, but your ability to recover, and your recovery time objective, will depend a great deal on the degree of automation you can build in for the recovery. Just be aware, and do your homework, as to what control plane tools are available, how they integrate with the storage system you've chosen, and whether they align with your recovery time objectives.

Q. What's the difference between cluster-level encryption and volume-level encryption in this context?

A. For security, you'll want to make sure that your storage solution supports encryption. Cluster-wide encryption is at the device level and protects against external breaches. As an advanced feature, some solutions provide volume-level encryption as well. This protects individual applications or tenants from others within the cluster. Encryption keys are created and can be stored within the cluster, but then those with cluster access could hack those keys, so support for integration with an external key management system is also preferable to enhance security.

Q. What about some of these governance requirements like SEC, FINRA, and GDPR? How does container-native storage help?

A. This is really a question about the security factors of your storage system. GDPR has a lot of governance requirements, and ensuring that you have proper security and encryption in place in case data is lost is a key priority. FINRA is more of a US financial brokerage regulation working with the Securities and Exchange Commission; things like data immutability may be an important feature for financial organizations. Other agencies, like the US government, have encryption requirements like FIPS, which certifies cryptography APIs within an operating system kernel. Storage solutions that make use of those crypto APIs would be better suited for particular use cases. So, it's not really a question of your storage being certified by any of these regulatory bodies, but rather of ensuring that your persistent storage layer integrated with Kubernetes does not break compliance of the overall solution.

Q. How is data federation used in Kubernetes?

A. Since Kubernetes offers an orchestration and management platform that can be delivered across many different infrastructures, whether on-prem, on a public or private cloud, etc., being able to access and read data from a single source across Kubernetes clusters on top of differing infrastructures provides a huge advantage for multi- and hybrid-cloud deployments. There are also tools that allow you to federate SQL queries across different storage platforms, whether they are in Kubernetes or not. Extending your reach to data existing off-cluster helps build data insights through analytics engines and provides data discovery for machine learning model management.
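One widely used class of tools for the SQL federation mentioned above is a distributed query engine such as Trino, which can join data sitting in object storage with data in an external relational database in a single query, without copying either dataset. The sketch below uses the trino Python client; the host, catalogs, and table names are assumptions for illustration only, not part of any particular deployment.

```python
from trino.dbapi import connect

# Connect to an assumed Trino coordinator running inside (or alongside) the cluster.
conn = connect(
    host="trino.data-services.svc",   # assumed in-cluster service name
    port=8080,
    user="analyst",
    catalog="hive",                   # assumed catalog backed by object storage
    schema="lake",
)

cur = conn.cursor()

# Federated query: join events held in the data lake (object storage) with
# customer records in a separate relational catalog.
cur.execute(
    """
    SELECT c.region, count(*) AS events
    FROM hive.lake.click_events e
    JOIN postgresql.crm.customers c ON e.customer_id = c.id
    GROUP BY c.region
    """
)

for region, events in cur.fetchall():
    print(region, events)
```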
Q. What tools differentiate data acquisition and preparation in Kubernetes?

A. Ingesting data from edge devices or IoT into Kubernetes allows data engineers to create automated data pipelines. Tools within Kubernetes, like Knative, allow engineers to create triggered events that spawn applications within the system, further automating workflows. Additional tools, like bucket notifications and Kafka streams, can help with the movement, manipulation, and enhancement of data within the workstream. A lot of organizations are using distributed application workflows to build differentiated use cases with Kubernetes.
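As an illustration of the event-driven ingestion pattern described above, the sketch below consumes messages from a Kafka topic and performs a minimal preparation step, the kind of job that might sit at the front of such a pipeline. It uses the kafka-python package; the broker address, topic name, and message fields are assumptions for illustration, and a production pipeline would more likely be triggered by Knative eventing or bucket notifications rather than a hand-rolled loop.

```python
from kafka import KafkaConsumer
import json

# Subscribe to a (hypothetical) topic that edge devices publish sensor data to.
consumer = KafkaConsumer(
    "edge-sensor-readings",                     # assumed topic name
    bootstrap_servers="my-kafka:9092",          # assumed in-cluster broker address
    group_id="data-prep",
    auto_offset_reset="earliest",
    value_deserializer=lambda raw: json.loads(raw.decode("utf-8")),
)

# Minimal "preparation" step: filter and normalize records before they are
# written onward (to object storage, a feature store, etc.).
for message in consumer:
    reading = message.value
    if reading.get("quality") == "bad":
        continue                                # drop records flagged by the device
    normalized = {
        "device_id": reading["device"],         # hypothetical message fields
        "timestamp": reading["ts"],
        "value_celsius": float(reading["temp"]),
    }
    print(normalized)                           # placeholder for the downstream sink
```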

Join Us as We Return Live to FMS!

SNIA CMS Community

Jul 29, 2022

SNIA is pleased to be part of the Flash Memory Summit 2022 agenda, August 1-4, 2022 at the Santa Clara, CA Convention Center, with our volunteer leadership demonstrating solutions, chairing and speaking in sessions, and networking with FMS attendees at a variety of venues during the conference.

The ever-popular SNIA Reception at FMS features the SNIA groups Storage Management Initiative, Compute Memory and Storage Initiative, and Green Storage Initiative, along with SNIA alliance partners CXL Consortium, NVM Express, and OpenFabrics Alliance. Stop by B-203/204 at the Convention Center from 5:30 – 7:00 pm Monday, August 1 for refreshments and networking with colleagues to kick off the week!

You won't want to miss SNIA's mainstage presentation on Wednesday, August 3 at 2:40 pm in the Mission City Ballroom. SNIA Vice Chair Richelle Ahlvers of Intel will provide a perspective on how new storage technologies and trends are accelerating through standards and open communities.

In the Exhibit Hall, SNIA Storage Management Initiative and Compute Memory and Storage Initiative are FMS Platinum sponsors with a SNIA Demonstration Pavilion at booth #725. During exhibit hours, Tuesday evening through Thursday afternoon, 15 SNIA member companies will be featured in live technology demonstrations on storage management, computational storage, persistent memory, sustainability, and form factors; a Persistent Memory Programming Workshop and Hackathon; and theater presentations on SNIA's standards and alliance work.

Long-standing SNIA technology focus areas in computational storage and memory will be represented in the SNIA-sponsored System Architectures Track (SARC for short) – Tuesday for memory and Thursday for computational storage. SNIA is also pleased to sponsor a day of talks on CXL architectures, memory, and storage on Wednesday. These sessions will all be in Ballroom G.

A new Sustainability Track on Thursday morning in Ballroom A, led by the SNIA Green Storage Technical Work Group, includes presentations on SSD power management, real-world applications and storage workloads, and a carbon footprint comparison of SSDs vs. HDDs, followed by a panel discussion. SSDs will also be featured in two SNIA-led presentation/panel pairs – SSDS-102-1 and 102-2 Ethernet SSDs on Tuesday afternoon in Ballroom B, and SSDS-201-1 and 201-2 EDSFF E1 and E3 form factors on Wednesday morning in Ballroom D. SNIA Swordfish will be discussed in the DCTR-102-2 Enterprise Storage Part 2 session in Ballroom D on Tuesday morning.

And the newest SNIA technical work group – DNA Data Storage – will lead a new-to-2022 FMS track on Thursday morning in Great America Meeting Room 2, discussing topics like preservation of DNA for information storage, the looming need for molecular storage, and DNA sequencing at scale. Attendees can engage in questions and discussion in Part 2 of the track.

Additional ways to network with SNIA colleagues include the always popular chat with the experts – beer and pizza on Tuesday evening – plus sessions on cloud storage, artificial intelligence, blockchain, and an FMS theater presentation on real-world storage workloads.

Full details on session times, locations, chairs and speakers for all these exciting FMS activities can be found at www.snia.org/fms and on the Flash Memory Summit website. SNIA colleagues and friends can register for $100.00 off the full conference or single-day packages using the code SNIA22 at www.flashmemorysummit.com.

A Deep Dive on xPU Deployment and Solutions

John Kim

Jul 25, 2022

Our first and second webcasts in this xPU webcast series explained what xPUs are, how they work, and what they can do. If you missed them, they are available to watch on-demand in the SNIA Educational Library. On August 24, 2022, the SNIA Networking Storage Forum will host the third webcast in this series, “xPU Deployment and Solutions Deep Dive,” where our xPU experts will explain next steps for deployments, discussing:
When to Deploy:
  • Pros and cons of dedicated accelerator chips versus running everything on the CPU
  • xPU use cases across hybrid, multi-cloud and edge environments
  • Cost and power considerations
Where to Deploy:
  • Deployment operating models: Edge, Core Data Center, CoLo, or Public Cloud
  • System location: In the server, with the storage, on the network, or in all those locations
How to Deploy:
  • Mapping workloads to hyperconverged and disaggregated infrastructure
  • Integrating xPUs into workload flows
  • Applying offload and acceleration elements within an optimized solution
Register here. There’s no excuse to miss this informative webcast and discussion. We hope you will join us live on August 24th as we wade deeper into the xPU waters!

SNIA Experts Answer Questions on xPU Accelerator Offload Functions

David McIntyre

Jul 14, 2022

The popular xPU webcast series hosted by the SNIA Networking Storage Forum continued last month with an in-depth look at the accelerator offload functions of the xPU. Our experts discussed the problems xPUs solve, where in the system they live, and the functions they implement. If you missed the session, you can watch it on-demand and access the presentation slides at the SNIA Educational Library. The Q&A here offers additional insights into the role of the xPU.

Q. Since xPUs can see traffic on the host doesn't that widen the surface area for exposure if it were to be compromised?

A. There is another aspect of this question: it depends on who owns control of the device and who's allowed to run software there. If the organization that runs the infrastructure owns the xPU and controls the software that goes on there, then it can act as a security boundary to the rest of the host, which might be running user software or other kinds of software. So, you can use the xPU as a security check and a security boundary, and it could actually reduce the total attack surface or provide better security isolation. If you open up the xPU to be just another general-purpose micro server, then it has effectively the same attack surface as the hosting system, but you could run it in a mode, or control it in a mode, where it actually reduces the total attack surface and makes it a security boundary. That's one of the interesting notions that's come out in the industry on how xPUs can provide value.

Q. Before, the host internal-only traffic was only exposed if the host was compromised, but now if the xPU is compromised it might exfiltrate information without the host being aware. Cuts both ways - I get that it is a hardened domain.... but everything gets compromised eventually.  

A. Any programmable offload engine or hypervisor in a deployment has this same consideration. The xPU is most similar to a hypervisor that is providing common services such as storage or packet forwarding (vswitch) to its VMs. See the previous answer for additional discussion.

Q. What are the specific offloads and functions that xPUs offer that NICs and HBAs don't provide today?

A. From a storage offloads point of view, in addition to the data path offloads, the xPU has integrated SoC CPU cores. Portions of the storage stack, or the whole storage application and the control plane, could be moved to the xPU.

The addition of accessible CPU cores, programmable pipelines, and directly usable offload engines, coupled to a general-purpose operating system, make the xPU fundamentally different from previous standard NIC- or HBA-based offloads. For the xPU, we're now talking about the infrastructure services offloads with storage applications as one of the key use cases. For that reason, we have this new xPU terminology which describes this new type of device that offloads infrastructure services of the hypervisor functionality. With xPUs, the host CPU cores can be completely freed up for hosting customer applications, containers, and VMs. NICs and HBAs typically offload only specific network or storage functions. xPUs can run an expanded set of agents, data services or applications.

To summarize at a high-level, you have local switching both on the PCIe side and on the network side, together with general purpose processors, plus the degree of programmability of the accelerators and the flexibility in the ways you can use an xPU.

Q. When security offload is enabled, do we still need single flow 100G rate? Can you talk about use cases and where it may be needed?

A. If the application or workload needs 100G line rate (or any other single flow specific rate) encryption and integrity, you need to find a specific xPU model that supports the desired security offload rate. xPU models will have varying capabilities. Typical workloads which might require this scale of single flow rate include storage access across a local network, AI workloads, technical computing, video processing, and large-scale streaming.

Q. When will you be hosting the next xPU webcast?

A. We’re glad you asked! The third presentation in this series will be “xPU Deployment and Solutions Deep Dive” on August 24, 2022 where we will explain key considerations on when, where and how to deploy xPUs. You can register here.

SmartNICs to xPUs Q&A

Alex McDonald

Jun 27, 2022

The SNIA Networking Storage Forum kicked off its xPU webcast series last month with “SmartNICs to xPUs – Why is the Use of Accelerators Accelerating?” where SNIA experts defined what xPUs are, explained how they can accelerate offload functions, and cleared up confusion on many other names associated with xPUs such as SmartNIC, DPU, IPU, APU, NAPU. The webcast was highly-rated by our audience and already has more than 1,300 views. If you missed it, you can watch it on-demand and download a copy of the presentation slides at the SNIA Educational Library.

The live audience asked some interesting questions and here are answers from our presenters.

Q. How can we have redundancy on an xPU?

A. xPUs are optimal for optimizing and offloading server/appliance and application redundancy schemes. Being at the heart of data movement and processing at the server, xPUs can expose parallel data paths and be a reliable control point for server management. Also, the xPUs' fabric connecting the hosts can provide self-redundancy and elasticity, such that redundancy between xPU devices can be seamless, providing a simplified redundancy and availability scheme between the different entities in the xPU fabric that connects the servers over the network. The fact that xPUs don't run the user applications (or, in the worst case, run only some offload functions for them) makes them a truly stable and reliable control point for such redundancy schemes. It's also possible to put two (or potentially more) xPUs into each server to provide redundancy at the xPU level.

Q. More of a comment. I'm in the SSD space, and with the ramp-up in E1.L/E1.S and E3, space is being optimized for these SmartNICs, GPUs, DPUs, etc. It also allows better use of space inside a server/node and allows for serial interface placement on the PCB. Great discussion today.

A. Yes, it’s great to see servers and component devices evolving towards supporting cloud-ready architectures and composable infrastructure for data centers. We anticipate that xPUs will evolve into a variety of physical form factors within the server especially with the modular server component standardization work that is going on. We’re glad you enjoyed the session.

Q. How does CXL impact xPUs and their communication with other components such as DRAM? Will this eliminate DDR and not TCP/IP?   

A. xPUs might use CXL as an enhanced interface to the host, to local devices connected to the xPU or to a CXL fabric that acts as an extension of local devices and xPUs network, for example connected to an entity like a shared memory pool. CXL can provide an enhanced, coherent memory interface and can take a role in extending access to slower tiers of memory to the host or devices through the CXL.MEM interface. It can also provide a coherent interface through the CXL.CACHE interface that can create an extended compute interface and allow close interaction between host and devices. We think CXL will provide an additional tier for memory and compute that will be living side by side with current tiers of compute and memory, each having its own merit in different compute scenarios. Will CXL eliminate DDR? Local DDR for the CPU will always have a latency advantage and will provide better compute in some use cases, so CXL memory will add additional tiers of memory/PMEM/storage in addition to that provided by DDR.  

Q. Isn't a Fibre Channel (FC) HBA very similar to a DPU, but for FC?

A. The NVMe-oF offloads make the xPU equivalent to an FC HBA, but the xPU can also host additional offloads and services at the same time. Both FC HBAs and xPUs typically accelerate and offload storage networking connections and can enable some amount of remote management. They may also offload storage encryption tasks. However, xPUs typically support general networking and might also support storage tasks, while FC HBAs always support Fibre Channel storage tasks and rarely support any non-storage functions.

Q. Were the old TCP Offload Engine (TOE) cards from Adaptec many years ago considered xPU devices, that were used for iSCSI?

A. They were not considered xPUs as—like FC HBAs—they only offloaded storage networking traffic, in this case iSCSI traffic over TCP. In addition, the terms “xPU,” “IPU” and “DPU” were not in use at that time. However, TOE and equivalent cards laid the groundwork for the evolution to the modern xPU.

Q. For xPU sales to grow dramatically won't that happen after CXL has a large footprint in data centers?

A. The CXL market is focused on coherent device and memory extension connections to the host, while the xPU market is focused on devices that handle data movement and processing offload for the host, connected over networks. As such, the CXL and xPU markets are complementary. Each market has its own segments, use cases, and viability, independent of the other. As discussed above, the technical solutions are complements, so that the evolution of each market proliferates from the other. Broader adoption of CXL will enable faster and broader functionality for xPUs, but it is not required for rapid growth of the xPU market.

Q. What role will CXL play in these disaggregated data centers?

A. The ultimate future of CXL is a little hard to predict. CXL has a potential role in disaggregation of coherent devices and memory pools at the chassis/rack scale level with CXL switch devices, while xPUs have the role of disaggregating at the rack/datacenter level. xPUs will start out connecting multiple servers across multiple racks then extend across the entire data center and potentially across multiple data centers (and potentially from cloud to edge). It is likely that CXL will start out connecting devices within a server then possibly extend across a rack and eventually across multiple racks. 

If you are interested in learning more about xPUs, I encourage you to register for our second webcast, “xPU Accelerator Offload Functions,” to hear what problems the xPUs are coming to solve, where in the system they live, and the functions they implement.

Summit Success – and A Preview of What’s To Come

SNIAOnStorage

Jun 23, 2022

Last month’s SNIA Persistent Memory and Computational Storage Summit (PM+CS Summit) put on a great show with 35 technology presentations from 41 speakers. Every presentation is now available online with a video and PDF found at www.snia.org/pm-summit.

Recently, SNIA On Storage sat down with David McIntyre, Summit Chair from Samsung, on his impressions of this 10th annual event.

SNIA On Storage (SOS): What were your thoughts on key topics coming into the Summit, and did they change based on the presentations?

David McIntyre (DM): We were excited to attract technology leaders to speak on the state of computational storage and persistent memory. Both mainstage and breakout speakers did a good job of encapsulating and summarizing what is happening today. Through the different talks, we learned more about infrastructure deployments supporting underlying applications and use cases. A new area where attendees gained insight was computational memory. I find it encouraging that as an industry we are moving forward on focusing on applications and use cases, and supporting software and infrastructure that resides across persistent memory and computational storage. And with computational memory, we are now getting more into system infrastructure concerns and making these technologies more accessible to application developers.

SOS: Any sessions you want to recommend to viewers?

DM: We had great feedback on our speakers during the live event. Several sessions I might recommend are Gary Grider of Los Alamos National Labs (LANL), who explained how computational storage is being deployed across his lab; Chris Petersen of Meta, who took an infrastructure view on considerations for persistent memory and computational storage; and Andy Walls of IBM, who presented a nice viewpoint of his vision of computational storage and its underlying benefits that make the overall infrastructure richer and more efficient, and how to bring compute to the drives. For a summary, watch Dave Eggleston of In-Cog Computing, who led Tuesday and Wednesday panels with the mainstage speakers that provided a wide-ranging discussion on the Summit's key topics.

SOS: What do you see as the top takeaways from the Summit presenters?

DM: I see three:
  1. Infrastructure, applications, and use cases were paramount themes across a number of presentations
  2. Tighter coupling of technologies.  Cheolmin Park of Samsung, in his CXL and UCIe presentation, discussed how we already have point technologies that now need to interact together.  There is also the Persistent Memory/SSD/DRAM combination – a tiered memory configuration talked about for years.  We are seeing deployment use cases where the glue is interfacing the I/O technology with CXL and UCIe.
  3. Another takeaway strongly related to the above is heterogeneous operations and compute.  Compute can’t reside in one central location for efficiency.  Rather, it must be distributed – addressing real-time analytics and decision making to support applications.
SOS: What upcoming activities should Summit viewers plan to attend, and why?

DM: Put Flash Memory Summit, August 1-4, 2022, on your calendars. Here SNIA will go deeper into areas we explored at the Summit. First, join SNIA Compute, Memory, and Storage Initiative (CMSI), underwriter of the PM+CS Summit, as we meet in person for the first time in a long time at the SNIA Reception on Monday evening, August 1, at the Santa Clara Convention Center from 5:30 pm – 7:00 pm. Along with our SNIA teammates from the SNIA Storage Management Initiative, network with colleagues and share an appetizer or two as we gear up for three full days of activities.

At the Summit, the SNIA-sponsored System Architecture Track will feature a day on persistent memory, a day on CXL, and a day on computational storage. SNIA will also lead sessions on form factors, Ethernet SSDs, sustainability, and DNA data storage. I am Track Manager of the Artificial Intelligence Applications Track, where we will see how technologies like computational storage and AI work hand-in-hand. SNIA will have a Demonstration Pavilion at booth 725 in the FMS Exhibit Hall with live demonstrations of computational storage applications, persistent memory implementations, and scalable storage management with SNIA Alliance Partners; hands-on form factor displays; a Persistent Memory Programming Workshop and Hackathon; and theater presentations on standards. Full details are at www.flashmemorysummit.com.

In September, CMSI will be at the SNIA Storage Developer Conference, where we will celebrate SNIA's 25th anniversary and gather in person for sessions, demonstrations, and those ever-popular Birds-of-a-Feather sessions. Find the latest details at www.storagedeveloper.org.

SOS: Any final thoughts?

DM: On behalf of SNIA CMSI and the PM+CS Summit Planning Team, I'd like to thank all those who planned and attended our great event. We are progressing in the right direction, beginning to talk the same language that application developers and solution providers understand. We'll keep building our strategic collaboration across different worlds at FMS and SDC. I appreciate the challenges and working together.

xPU Accelerator Offload Functions

David McIntyre

Jun 10, 2022

As covered in our first xPU webcast, “SmartNICs and xPUs: Why is the Use of Accelerators Accelerating,” we discussed the trend to deploy dedicated accelerator chips to assist or offload the main CPU. These new accelerators (xPUs) go by multiple names such as SmartNIC, DPU, IPU, APU, and NAPU. If you missed the presentation, I encourage you to check it out in the SNIA Educational Library, where you can watch it on-demand and access the presentation slides. The second webcast in this SNIA Networking Storage Forum xPU series is “xPU Accelerator Offload Functions,” where our SNIA experts will take a deeper dive into the accelerator offload functions of the xPU. We'll discuss what problems the xPUs are coming to solve, where in the system they live, and the functions they implement, focusing on:
  • Network Offloads
    • Virtual switching and NPU
    • P4 pipelines
    • QoS and policy enforcement
    • NIC functions
    • Gateway functions (tunnel termination, load balancing, etc.)
  • Security Offloads
    • Encryption
    • Policy enforcement
    • Key management and crypto
    • Regular expression matching
    • Firewall
    • Deep Packet Inspection (DPI)
  • Compute Offloads
    • AI calculations, model resolution
    • General purpose processing (via local cores)
    • Emerging use of P4 for general purpose
  • Storage Offloads
    • Compression and data at rest encryption
    • NVMe-oF offload
    • Regular expression matching
    • Storage stack offloads
This webcast will be live on June 29, 2022 at 11:00 am PT/2:00 pm ET. I encourage you to register today and bring your questions for our experts. We look forward to seeing you.

Keeping Edge Data Secure Q&A

David McIntyre

Jun 9, 2022

The complex and changeable structure of edge computing, together with its network connections, massive real-time data, challenging operating environments, distributed edge-cloud collaboration, and other characteristics, creates a multitude of security challenges. This was the topic of our SNIA Networking Storage Forum (NSF) live webcast “Storage Life on the Edge: Security Challenges,” where SNIA security experts Thomas Rivera, CISSP, CIPP/US, CDPSE and Eric Hibbard, CISSP-ISSAP, ISSMP, ISSEP, CIPP/US, CIPT, CISA, CDPSE, CCSK debated whether existing security practices and standards are adequate for this emerging area of computing. If you missed the presentation, you can view it on-demand here.

It was a fascinating discussion and as promised, Eric and Thomas have answered the questions from our live audience.

Q. What complexities are introduced from a security standpoint for edge use cases?

A. The sheer number of edge nodes, the heterogeneity of the nodes, distributed ownership and control, increased number of interfaces, fit-for-use versus designed solution, etc. complicate the security aspects of these ecosystems. Performing risk assessments and/or vulnerability assessments across the full ecosystem can be extremely difficult; remediation activities can be even harder.

Q. How is data privacy impacted and managed across cloud to edge applications?

A. Movement of data from the edge to core systems could easily cross multiple jurisdictions that have different data protection/privacy requirements. For example, personal information harvested in the EU might find its way into core systems in the US; in such a situation, the US entity would need to deal with GDPR requirements or face significant penalties. The twist is that the operator of the core systems might not know anything about the source of the data.


Q. What are the priority actions that customers can undertake to protect their data?

A. Avoid giving personal information. If you do, understand your rights (if any) as well as how it will be used, protected, and ultimately eliminated.

This session is part of our “Storage Life on the Edge” webcast series. Our next session will be “Storage Life on the Edge: Accelerated Performance Strategies” where we will dive into the need for faster computing, access to storage, and movement of data at the edge as well as between the edge and the data center. Register here to join us on July 12, 2022. You can also access the other presentations we’ve done in this series at the SNIA Educational Library.
