Sorry, you need to enable JavaScript to visit this website.

Here’s Why Ceph is the Linux of Storage Today

Erin Farr

Feb 14, 2024

title of post
Data is one of the most critical resources of our time. Storage for data has always been a critical architectural element for every data center, requiring careful considerations for storage performance, scalability, reliability, data protection, durability and resilience. A decade ago, the market was aggressively embracing public storage because of its agility and scalability. In the last few years, people have been rethinking that approach, moving toward on-premises storage with cloud consumption models. The new cloud native architecture on-premises has the promise of the traditional data center’s security and reliability with cloud agility and scalability. Ceph, an Open Source project for enterprise unified software-defined storage, represents a compelling solution for this cloud native on-premises architecture and will be the topic of our next SNIA Cloud Storage Technologies Initiative webinar, “Ceph: The Linux of Storage Today.” This webinar will discuss:
  • How Ceph targets important characteristics of modern software-defined data centers
  • Use cases that illustrate how Ceph has evolved, along with future use cases
  • Quantitative data points that exemplify Ceph’s community success
We will describe how Ceph is gaining industry momentum, satisfying enterprise architectures’ data storage needs and how the technology community is investing to enable the vision of “Ceph, the Linux of Storage Today.” Register today to join us for this timely discussion.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Throughput, IOPs, and Latency Q&A

Erik Smith

Feb 12, 2024

title of post
Throughput, IOPs, and latency are three terms often referred to as storage performance metrics. But the exact definitions of these terms and how they differ can be confusing. That’s why the SNIA Networking Storage Forum (NSF) brought back our popular webinar series, “Everything You Wanted to Know About Storage, But Were Too Proud to Ask,” with a live webinar, “Everything You Wanted to Know about Throughput, IOPs, and Latency But Were Too Proud to Ask.” The live session was a hit with over 850 views in the first 48 hours. If you missed the live event, you can watch it on-demand. Our audience asked several interesting questions, here are our answer to them. Q: Discussing congestion and mechanisms at play in RoCEv2 (DCQCN and delay-change control) would be more interesting than legacy BB_credit handling in FC SAN... A: FC’s BB_Credit mechanism was chosen for the sake of example, but many of the concepts discussed such as oversubscription, congestion, congestion spreading, etc apply to other transport protocols as well. For example, RoCE can benefit from the use of explicit congestion control mechanisms such as DCQCN, or a combination of PFC and ECN, and then there are vendor specific approaches such as RTTCC that can be used to minimize the dependency on the need for explicit congestion control. Q: Does NVMe have or need the equivalent of SCSI ALUA? A: Yes, the NVMe equivalent of SCSI’s ALUA is called Asymmetric Namespace Access (ANA). Q: Will CXL fundamentally change IO performance and how soon will CXL become prevalent? A: CXL runs over PCIe, and as a result will be subject to many of the same constraints, especially distance. Most of the CXL use cases will mainly be applicable at rack scale, therefore we do not anticipate it being used for IO between a compute node and external storage. Q: Do people use a single storage system for various phases of AI/ML use cases (e.g., data collection and training) -- or do they use different storage systems (e.g., one storage system for the data collection phase and a different storage system for AI/ML training)? A: Typically, the same storage system is used for ingestion and checkpointing. In either case, high performance is key because the more time the system spends on data transfer for either of these phases, the longer the GPUs remain idle and consequently for training to complete. Q: In the increasing world of hyperconverged ("shared nothing") architecture where storage is spread across different nodes, what is the best approach to increase IOPs especially for applications like video and voice where real-time response times are important? A: Using a scale-out approach (i.e., adding nodes that provide storage services) is the most common way to increase overall system performance for these kinds of deployments. Ensuring the network is properly sized and not congested as well as ensuring appropriate compute resources in each node can help. Q: Why is small file I/O overhead so large? Metadata generally is still much smaller compared to files of a few KB or larger. A: While metadata is a small percentage of the data transfer phase for any file transfer and is approximately the same percentage for any file size, small files require the same number of host interactions to initiate and complete the I/O. This host interaction of a command for the operation followed by the device return of completion notification is what adds to the percentage of overhead for small file transfers. For example, if a file is only 128 bytes long but there is a 32 byte request and a 32 byte completion, the overhead is 33% of the time required for the file transfer. For a 4 KB file transfer, this overhead is reduced to 1.5% of the time required to transfer the file. Q: Is SPC 1/2 still a relevant benchmark for IO performance? If not, what is the gold standard today? A: Yes, Storage Performance Council Benchmarks SPC 1/2 are still relevant and are continuing to be updated. You can also use tools like Vdbench and FIO for performance characterization. That said, the best benchmark to use is the one that most closely matches the application you expect to be running on the infrastructure you are testing. As AI becomes increasingly more important you can also checkout MLPerf.  

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Throughput, IOPs, and Latency Q&A

Erik Smith

Feb 12, 2024

title of post
Throughput, IOPs, and latency are three terms often referred to as storage performance metrics. But the exact definitions of these terms and how they differ can be confusing. That’s why the SNIA Networking Storage Forum (NSF) brought back our popular webinar series, “Everything You Wanted to Know About Storage, But Were Too Proud to Ask,” with a live webinar, “Everything You Wanted to Know about Throughput, IOPs, and Latency But Were Too Proud to Ask.” The live session was a hit with over 850 views in the first 48 hours. If you missed the live event, you can watch it on-demand. Our audience asked several interesting questions, here are our answer to them. Q: Discussing congestion and mechanisms at play in RoCEv2 (DCQCN and delay-change control) would be more interesting than legacy BB_credit handling in FC SAN… A: FC’s BB_Credit mechanism was chosen for the sake of example, but many of the concepts discussed such as oversubscription, congestion, congestion spreading, etc apply to other transport protocols as well. For example, RoCE can benefit from the use of explicit congestion control mechanisms such as DCQCN, or a combination of PFC and ECN, and then there are vendor specific approaches such as RTTCC that can be used to minimize the dependency on the need for explicit congestion control. Q: Does NVMe have or need the equivalent of SCSI ALUA? A: Yes, the NVMe equivalent of SCSI’s ALUA is called Asymmetric Namespace Access (ANA). Q: Will CXL fundamentally change IO performance and how soon will CXL become prevalent? A: CXL runs over PCIe, and as a result will be subject to many of the same constraints, especially distance. Most of the CXL use cases will mainly be applicable at rack scale, therefore we do not anticipate it being used for IO between a compute node and external storage. Q: Do people use a single storage system for various phases of AI/ML use cases (e.g., data collection and training) — or do they use different storage systems (e.g., one storage system for the data collection phase and a different storage system for AI/ML training)? A: Typically, the same storage system is used for ingestion and checkpointing. In either case, high performance is key because the more time the system spends on data transfer for either of these phases, the longer the GPUs remain idle and consequently for training to complete. Q: In the increasing world of hyperconverged (“shared nothing”) architecture where storage is spread across different nodes, what is the best approach to increase IOPs especially for applications like video and voice where real-time response times are important? A: Using a scale-out approach (i.e., adding nodes that provide storage services) is the most common way to increase overall system performance for these kinds of deployments. Ensuring the network is properly sized and not congested as well as ensuring appropriate compute resources in each node can help. Q: Why is small file I/O overhead so large? Metadata generally is still much smaller compared to files of a few KB or larger. A: While metadata is a small percentage of the data transfer phase for any file transfer and is approximately the same percentage for any file size, small files require the same number of host interactions to initiate and complete the I/O. This host interaction of a command for the operation followed by the device return of completion notification is what adds to the percentage of overhead for small file transfers. For example, if a file is only 128 bytes long but there is a 32 byte request and a 32 byte completion, the overhead is 33% of the time required for the file transfer. For a 4 KB file transfer, this overhead is reduced to 1.5% of the time required to transfer the file. Q: Is SPC 1/2 still a relevant benchmark for IO performance? If not, what is the gold standard today? A: Yes, Storage Performance Council Benchmarks SPC 1/2 are still relevant and are continuing to be updated. You can also use tools like Vdbench and FIO for performance characterization. That said, the best benchmark to use is the one that most closely matches the application you expect to be running on the infrastructure you are testing. As AI becomes increasingly more important you can also checkout MLPerf.   The post Throughput, IOPs, and Latency Q&A first appeared on SNIA on Network Storage.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Power Efficiency Measurement – Our Experts Make It Clear – Part 2

title of post
Measuring power efficiency in datacenter storage is a complex endeavor. A number of factors play a role in assessing individual storage devices or system-level logical storage for power efficiency. Luckily, our SNIA experts make the measuring easier! In this SNIA Experts on Data blog series, our experts in the SNIA Solid State Storage Technical Work Group and the SNIA Green Storage Initiative explore factors to consider in power efficiency measurement, including the nature of application workloads, IO streams, and access patterns; the choice of storage products (SSDs, HDDs, cloud storage, and more); the impact of hardware and software components (host bus adapters, drivers, OS layers); and access to read and write caches, CPU and GPU usage, and DRAM utilization. Join us on our journey to better power efficiency as we continue with Part 2: Impact of Workloads on Power Efficiency Measurement.  And if you missed Part 1: Key Issues in Power Efficiency Measurement, you can find it here.  Bookmark this blog  and check back in March and April for the continuation of our four-part series. And explore the topic further in the SNIA Green Storage Knowledge Center. Part 2: Impact of Workloads on Power Efficiency Measurement Workloads are a significant driving force behind power consumption in computing systems. Different tasks and applications place diverse demands on hardware, leading to fluctuations in the amount of power used. Here's a breakdown of how workloads can influence power consumption:
  • CPU Utilization. The CPU's power consumption increases as it processes tasks, with more demanding workloads that involve complex calculations or multitasking leading to higher CPU utilization and, consequently, elevated power usage.
  • Memory Access is another key factor. Accessing memory modules consumes power, and workloads that heavily rely on frequent memory read and write operations can significantly contribute to increased power consumption.
  • Disk Activity, particularly read and write operations on storage devices (whether HDDs or SSDs), consumes power. Workloads that involve frequent data access or large file transfers can lead to an uptick in power consumption. GPU Usage plays a crucial role, especially in tasks like gaming, video editing, and machine learning. High GPU utilization for rendering complex graphics or training deep neural networks can result in substantial power consumption.
  • Network Communication tasks, such as data transfers, streaming, or online gaming, require power from both the CPU and the network interface. The extent of communication and data throughput can significantly affect overall power usage.
  • In devices equipped with displays, Screen Brightness directly impacts power consumption. Brighter screens consume more power, which means workloads involving continuous display usage contribute to higher power consumption.
  • I/O Operations encompass interactions with peripherals like storage devices or printers. These operations can lead to short bursts of power consumption, especially if multiple devices are connected.
  • Understanding the contrast between Idle and Active States is essential. Different workloads can transition devices between these states, with idle periods generally exhibiting lower power consumption. However, certain workloads may keep components active even during seemingly idle times.
  • Dynamic Voltage and Frequency Scaling are prevalent in many systems, allowing them to adjust the voltage and frequency of components based on workload demands. Increased demand leads to higher clock speeds and voltage, ultimately resulting in more significant power consumption.
  • Background Processes also come into play. Background applications, updates, and system maintenance tasks can impact power consumption, even when the user isn't actively engaging with the device.
In practical terms, comprehending how various workloads affect power consumption is vital for optimizing energy efficiency. For instance, laptops can extend their battery life by reducing screen brightness, closing unnecessary applications, and selecting power-saving modes. Moreover, SSDs are designed with optimizations for background processes in mind. Garbage collection and NAND Flash cell management often occur during idle periods or periods of low-impact workloads. Likewise, data centers and cloud providers strategically manage workloads to minimize energy consumption and operational costs while upholding performance standards.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Power Efficiency Measurement – Our Experts Make It Clear – Part 2

title of post
Measuring power efficiency in datacenter storage is a complex endeavor. A number of factors play a role in assessing individual storage devices or system-level logical storage for power efficiency. Luckily, our SNIA experts make the measuring easier! In this SNIA Experts on Data blog series, our experts in the SNIA Solid State Storage Technical Work Group and the SNIA Green Storage Initiative explore factors to consider in power efficiency measurement, including the nature of application workloads, IO streams, and access patterns; the choice of storage products (SSDs, HDDs, cloud storage, and more); the impact of hardware and software components (host bus adapters, drivers, OS layers); and access to read and write caches, CPU and GPU usage, and DRAM utilization. Join us on our journey to better power efficiency as we continue with Part 2: Impact of Workloads on Power Efficiency Measurement.  And if you missed Part 1: Key Issues in Power Efficiency Measurement, you can find it here.  Bookmark this blog  and check back in March and April for the continuation of our four-part series. And explore the topic further in the SNIA Green Storage Knowledge Center. Part 2: Impact of Workloads on Power Efficiency Measurement Workloads are a significant driving force behind power consumption in computing systems. Different tasks and applications place diverse demands on hardware, leading to fluctuations in the amount of power used. Here’s a breakdown of how workloads can influence power consumption:
  • CPU Utilization. The CPU’s power consumption increases as it processes tasks, with more demanding workloads that involve complex calculations or multitasking leading to higher CPU utilization and, consequently, elevated power usage.
  • Memory Access is another key factor. Accessing memory modules consumes power, and workloads that heavily rely on frequent memory read and write operations can significantly contribute to increased power consumption.
  • Disk Activity, particularly read and write operations on storage devices (whether HDDs or SSDs), consumes power. Workloads that involve frequent data access or large file transfers can lead to an uptick in power consumption. GPU Usage plays a crucial role, especially in tasks like gaming, video editing, and machine learning. High GPU utilization for rendering complex graphics or training deep neural networks can result in substantial power consumption.
  • Network Communication tasks, such as data transfers, streaming, or online gaming, require power from both the CPU and the network interface. The extent of communication and data throughput can significantly affect overall power usage.
  • In devices equipped with displays, Screen Brightness directly impacts power consumption. Brighter screens consume more power, which means workloads involving continuous display usage contribute to higher power consumption.
  • I/O Operations encompass interactions with peripherals like storage devices or printers. These operations can lead to short bursts of power consumption, especially if multiple devices are connected.
  • Understanding the contrast between Idle and Active States is essential. Different workloads can transition devices between these states, with idle periods generally exhibiting lower power consumption. However, certain workloads may keep components active even during seemingly idle times.
  • Dynamic Voltage and Frequency Scaling are prevalent in many systems, allowing them to adjust the voltage and frequency of components based on workload demands. Increased demand leads to higher clock speeds and voltage, ultimately resulting in more significant power consumption.
  • Background Processes also come into play. Background applications, updates, and system maintenance tasks can impact power consumption, even when the user isn’t actively engaging with the device.
In practical terms, comprehending how various workloads affect power consumption is vital for optimizing energy efficiency. For instance, laptops can extend their battery life by reducing screen brightness, closing unnecessary applications, and selecting power-saving modes. Moreover, SSDs are designed with optimizations for background processes in mind. Garbage collection and NAND Flash cell management often occur during idle periods or periods of low-impact workloads. Likewise, data centers and cloud providers strategically manage workloads to minimize energy consumption and operational costs while upholding performance standards. The post Power Efficiency Measurement – Our Experts Make It Clear – Part 2 first appeared on SNIA Compute, Memory and Storage Blog.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Complexities of Object Storage Compatibility Q&A

Gregory Touretsky

Jan 26, 2024

title of post
72% of organizations have encountered incompatibility issues between various object storage implementations according to a poll at our recent SNIA Cloud Storage Technologies Initiative webinar, “Navigating the Complexities of Object Storage Compatibility.” If you missed the live presentation or you would like to see the answers to the other poll questions we asked the audience, you can view it on-demand at the SNIA Educational Library. The audience was highly-engaged during the live event and asked several great questions. Here are answers to them all. Q. Do you see the need for fast object storage for AI kind of workloads? A. Yes, the demand for fast object storage in AI workloads is growing. Initially, object storage was mainly used for backup or archival purposes. However, its evolution into Data Lakes and the introduction of features like the S3 SELECT API have made it more suitable for data analytics. The launch of Amazon's S3 Express, a faster yet more expensive tier, is a clear indication of this trend. Other vendors are following suit, suggesting a shift towards object storage as a primary data storage platform for specific workloads. Q. As Object Storage becomes more prevalent in the primary storage space, could you talk about data protection, especially functionalities like synchronous replication and multi-site deployments - or is your view that this is not needed for object storage deployments? A. Data protection, including functionalities like synchronous replication and multi-site deployments, is essential for object storage, especially as it becomes more prevalent in primary storage. Various object storage implementations address this differently. For instance, Amazon S3 supports asynchronous replication. Azure ZRS (Zone-redundant storage) offers something akin to synchronous replication within a specific geographical area. Many on-premises solutions provide multi-site deployment and replication capabilities. It's crucial for vendors to offer distinct features and value additions, giving customers a range of choices to best meet their specific requirements. Ultimately, customers must define their data availability and durability needs and select the solution that aligns with their use case. Q. Regarding polling question #3 during the webinar, why did the question only ask “above 10PB?” We look for multi-PB like 100PB ... does this mean Object Storage is not suitable for multi PB? A. Object storage is inherently scalable and can support deployments ranging from petabyte to exabyte scale. However, scalability can vary based on specific implementations. Each object storage solution may have its own limits in terms of capacity. It's important to review the details of each solution to ensure it meets your specific needs for multi-petabyte scale deployments. Q. Is Wasabi 100% Compatible with Amazon S3? A. While we typically avoid discussing specific vendors in a general forum, it's important to note that most 'S3-compatible' object storage implementations have some discrepancies when compared to Amazon S3. These differences can vary in significance. Therefore, we always recommend testing your actual workload against the specific object storage solution to identify any critical issues or incompatibilities. Q. What are the best ways to see a unified view of different types of storage -- including objects, file and blocks? This may be most relevant for enterprise-wide data tracking and multi-cloud deployments. A. There are various solutions available from different vendors that offer visibility into multiple types of storage, including object, file, and block storage. These solutions are particularly useful for enterprise-wide data management and multi-cloud deployments. However, this topic extends beyond the scope of our current discussion. SNIA might consider addressing this in a separate, dedicated webinar in the future. Q. Is there any standard object storage implementation against which the S3 compatibility would be defined? A. Amazon S3 serves as a de facto standard for object storage implementation. Independent software vendors (ISVs) can decide the degree of compatibility they want to achieve with Amazon S3, including which features to implement and to what extent. The objective isn't necessarily to achieve identical functionality across all implementations, but rather for each ISV to be cognizant of the specific differences and potential incompatibilities in their own solutions. Being aware of these discrepancies is key, even if complete compatibility isn't achieved. Q. With the introduction of directory buckets, do you anticipate vendors picking up compatibility there as well or maintaining a strictly flat namespace? A. That's an intriguing question. We are putting together an on-going object storage forum, which will delve into more in follow-up calls, and will serve as a platform for these kinds of discussions. We anticipate addressing not only the concept of directory buckets versus a flat namespace, but also exploring other ideas like performance enhancements and alternate transport layers for S3. This forum is intended to be a collaborative space for discussing future directions in object storage. If you’re interested, contact the cloudtwg@snia.org. Q. How would an incompatibility be categorized as something that is important for clients vs. just something that doesn't meet the AWS spec/behavior? A. Incompatibilities should be assessed based on the specific needs and priorities of each implementor. While we don't set universal compatibility goals, it's up to every implementor to determine how closely they align with S3 or other protocols. They must decide whether to address any discrepancies in behavior or functionality based on their own objectives and their clients' requirements. Essentially, the significance of an incompatibility is determined by its impact on the implementor's goals and client needs. Q. Have customers experienced incompatibilities around different SDKs with regard to HA behaviors? Load balancers vs. round robin DNS vs. other HA techniques on-prem and in the cloud? A. Yes, customers do encounter incompatibilities related to different SDKs, particularly concerning high availability (HA) behaviors. Object storage encompasses more than just APIs; it also involves implementation choices like load balancing decisions and HA techniques. Discrepancies often arise due to these differences, especially when object storage solutions are deployed within a customer's data center and need to integrate with the existing networking infrastructure. These incompatibilities can be due to various factors, including whether load balancing is handled through round robin DNS, dedicated load balancers, or other HA techniques, either on-premises or in the cloud. Q. Any thoughts on keeping pace with AWS as they evolve the S3 API? I'm specifically thinking about the new Directory Bucket type and the associated API changes to support hierarchy. A. We at the SNIA Cloud Storage Technical Work Group are in dialogue with Amazon and are encouraging their participation in our planned Plugfest at SDC’24. Their involvement would be invaluable in helping us anticipate upcoming changes and understand new developments, such as the Directory Bucket type and its associated API changes. This new variation of S3 from Amazon, which differs from the original implementation, underscores the importance of compatibility testing. While complete compatibility may not always be achievable, it's crucial for ISVs to be fully aware of how their implementations differ from S3's evolving standards. Q. When it comes to object store data protection with backup software, do you see also some incompatibilities with recovered data? A. When data is backed up to an object storage system, there's a fundamental expectation that it can be reliably retrieved later. This reliability is a cornerstone of any storage platform. However, issues can arise when data is initially stored in one specific object storage implementation and later transferred to a different one. If this transfer isn't executed in accordance with the backup software provider's requirements, it could lead to difficulties in accessing the data in the future. Therefore, careful planning and adherence to recommended practices are crucial during any data migration process to prevent such compatibility issues. The SNIA Cloud Storage Technical Work Group is actively working on this topic. If you want to get involved, reach out at cloudtwg@snia.org and follow us @sniacloud_com

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Complexities of Object Storage Compatibility Q&A

Gregory Touretsky

Jan 26, 2024

title of post
72% of organizations have encountered incompatibility issues between various object storage implementations according to a poll at our recent SNIA Cloud Storage Technologies Initiative webinar, “Navigating the Complexities of Object Storage Compatibility.” If you missed the live presentation or you would like to see the answers to the other poll questions we asked the audience, you can view it on-demand at the SNIA Educational Library. The audience was highly-engaged during the live event and asked several great questions. Here are answers to them all. Q. Do you see the need for fast object storage for AI kind of workloads? A. Yes, the demand for fast object storage in AI workloads is growing. Initially, object storage was mainly used for backup or archival purposes. However, its evolution into Data Lakes and the introduction of features like the S3 SELECT API have made it more suitable for data analytics. The launch of Amazon’s S3 Express, a faster yet more expensive tier, is a clear indication of this trend. Other vendors are following suit, suggesting a shift towards object storage as a primary data storage platform for specific workloads. Q. As Object Storage becomes more prevalent in the primary storage space, could you talk about data protection, especially functionalities like synchronous replication and multi-site deployments – or is your view that this is not needed for object storage deployments? A. Data protection, including functionalities like synchronous replication and multi-site deployments, is essential for object storage, especially as it becomes more prevalent in primary storage. Various object storage implementations address this differently. For instance, Amazon S3 supports asynchronous replication. Azure ZRS (Zone-redundant storage) offers something akin to synchronous replication within a specific geographical area. Many on-premises solutions provide multi-site deployment and replication capabilities. It’s crucial for vendors to offer distinct features and value additions, giving customers a range of choices to best meet their specific requirements. Ultimately, customers must define their data availability and durability needs and select the solution that aligns with their use case. Q. Regarding polling question #3 during the webinar, why did the question only ask “above 10PB?” We look for multi-PB like 100PB … does this mean Object Storage is not suitable for multi PB? A. Object storage is inherently scalable and can support deployments ranging from petabyte to exabyte scale. However, scalability can vary based on specific implementations. Each object storage solution may have its own limits in terms of capacity. It’s important to review the details of each solution to ensure it meets your specific needs for multi-petabyte scale deployments. Q. Is Wasabi 100% Compatible with Amazon S3? A. While we typically avoid discussing specific vendors in a general forum, it’s important to note that most ‘S3-compatible’ object storage implementations have some discrepancies when compared to Amazon S3. These differences can vary in significance. Therefore, we always recommend testing your actual workload against the specific object storage solution to identify any critical issues or incompatibilities. Q. What are the best ways to see a unified view of different types of storage — including objects, file and blocks? This may be most relevant for enterprise-wide data tracking and multi-cloud deployments. A. There are various solutions available from different vendors that offer visibility into multiple types of storage, including object, file, and block storage. These solutions are particularly useful for enterprise-wide data management and multi-cloud deployments. However, this topic extends beyond the scope of our current discussion. SNIA might consider addressing this in a separate, dedicated webinar in the future. Q. Is there any standard object storage implementation against which the S3 compatibility would be defined? A. Amazon S3 serves as a de facto standard for object storage implementation. Independent software vendors (ISVs) can decide the degree of compatibility they want to achieve with Amazon S3, including which features to implement and to what extent. The objective isn’t necessarily to achieve identical functionality across all implementations, but rather for each ISV to be cognizant of the specific differences and potential incompatibilities in their own solutions. Being aware of these discrepancies is key, even if complete compatibility isn’t achieved. Q. With the introduction of directory buckets, do you anticipate vendors picking up compatibility there as well or maintaining a strictly flat namespace? A. That’s an intriguing question. We are putting together an on-going object storage forum, which will delve into more in follow-up calls, and will serve as a platform for these kinds of discussions. We anticipate addressing not only the concept of directory buckets versus a flat namespace, but also exploring other ideas like performance enhancements and alternate transport layers for S3. This forum is intended to be a collaborative space for discussing future directions in object storage. If you’re interested, contact the cloudtwg@snia.org. Q. How would an incompatibility be categorized as something that is important for clients vs. just something that doesn’t meet the AWS spec/behavior? A. Incompatibilities should be assessed based on the specific needs and priorities of each implementor. While we don’t set universal compatibility goals, it’s up to every implementor to determine how closely they align with S3 or other protocols. They must decide whether to address any discrepancies in behavior or functionality based on their own objectives and their clients’ requirements. Essentially, the significance of an incompatibility is determined by its impact on the implementor’s goals and client needs. Q. Have customers experienced incompatibilities around different SDKs with regard to HA behaviors? Load balancers vs. round robin DNS vs. other HA techniques on-prem and in the cloud? A. Yes, customers do encounter incompatibilities related to different SDKs, particularly concerning high availability (HA) behaviors. Object storage encompasses more than just APIs; it also involves implementation choices like load balancing decisions and HA techniques. Discrepancies often arise due to these differences, especially when object storage solutions are deployed within a customer’s data center and need to integrate with the existing networking infrastructure. These incompatibilities can be due to various factors, including whether load balancing is handled through round robin DNS, dedicated load balancers, or other HA techniques, either on-premises or in the cloud. Q. Any thoughts on keeping pace with AWS as they evolve the S3 API? I’m specifically thinking about the new Directory Bucket type and the associated API changes to support hierarchy. A. We at the SNIA Cloud Storage Technical Work Group are in dialogue with Amazon and are encouraging their participation in our planned Plugfest at SDC’24. Their involvement would be invaluable in helping us anticipate upcoming changes and understand new developments, such as the Directory Bucket type and its associated API changes. This new variation of S3 from Amazon, which differs from the original implementation, underscores the importance of compatibility testing. While complete compatibility may not always be achievable, it’s crucial for ISVs to be fully aware of how their implementations differ from S3’s evolving standards. Q. When it comes to object store data protection with backup software, do you see also some incompatibilities with recovered data? A. When data is backed up to an object storage system, there’s a fundamental expectation that it can be reliably retrieved later. This reliability is a cornerstone of any storage platform. However, issues can arise when data is initially stored in one specific object storage implementation and later transferred to a different one. If this transfer isn’t executed in accordance with the backup software provider’s requirements, it could lead to difficulties in accessing the data in the future. Therefore, careful planning and adherence to recommended practices are crucial during any data migration process to prevent such compatibility issues. The SNIA Cloud Storage Technical Work Group is actively working on this topic. If you want to get involved, reach out at cloudtwg@snia.org and follow us @sniacloud_com

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Here’s Everything You Wanted to Know About Throughput, IOPs, and Latency

Erik Smith

Jan 9, 2024

title of post
Any discussion about storage systems is incomplete without the mention of Throughput, IOPs, and Latency. But what exactly do these terms mean, and why are they important? To answer these questions, the SNIA Networking Storage Forum (NSF) is bringing back our popular webinar series, “Everything You Wanted to Know About Storage, But Were Too Proud to Ask.” Collectively, these three terms are often referred to as storage performance metrics. Performance can be defined as the effectiveness of a storage system to address I/O needs of an application or workload. Different application workloads have different I/O patterns, and with that arises different bottlenecks, so there is no “one-size fits all” in storage systems. These storage performance metrics help with storage solution design and selection based on application/workload demands. Join us on February 7, 2024, for “Everything You Wanted to Know About Throughput, IOPS, and Latency, But Were Too Proud to Ask.” In this webinar, we’ll cover:
  • What storage performance metrics mean – understanding key terminology nuances
  • Why users/storage administrators should care about them
  • How these metrics impact application performance
  • Real-world use cases
It’s a live event and our experts will be available to answer any questions you may have. I hope you’ll register today. If you are not familiar with our “Too Proud to Ask” series, I encourage you to check out these 11 great on-demand webinars that provide 101 lessons on a wide range of storage terminology.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Here’s Everything You Wanted to Know About Throughput, IOPs, and Latency

Erik Smith

Jan 9, 2024

title of post
Any discussion about storage systems is incomplete without the mention of Throughput, IOPs, and Latency. But what exactly do these terms mean, and why are they important? To answer these questions, the SNIA Networking Storage Forum (NSF) is bringing back our popular webinar series, “Everything You Wanted to Know About Storage, But Were Too Proud to Ask.” Collectively, these three terms are often referred to as storage performance metrics. Performance can be defined as the effectiveness of a storage system to address I/O needs of an application or workload. Different application workloads have different I/O patterns, and with that arises different bottlenecks, so there is no “one-size fits all” in storage systems. These storage performance metrics help with storage solution design and selection based on application/workload demands. Join us on February 7, 2024, for “Everything You Wanted to Know About Throughput, IOPS, and Latency, But Were Too Proud to Ask.” In this webinar, we’ll cover:
  • What storage performance metrics mean – understanding key terminology nuances
  • Why users/storage administrators should care about them
  • How these metrics impact application performance
  • Real-world use cases
It’s a live event and our experts will be available to answer any questions you may have. I hope you’ll register today. If you are not familiar with our “Too Proud to Ask” series, I encourage you to check out these 11 great on-demand webinars that provide 101 lessons on a wide range of storage terminology. The post Here’s Everything You Wanted to Know About Throughput, IOPs, and Latency first appeared on SNIA on Network Storage.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Power Efficiency Measurement – Our Experts Make It Clear – Part 1

title of post
Measuring power efficiency in datacenter storage is a complex endeavor. A number of factors play a role in assessing individual storage devices or system-level logical storage for power efficiency. Luckily, our SNIA experts make the measuring easier! In this SNIA Experts on Data blog series, our experts in the SNIA Solid State Storage Technical Work Group and the SNIA Green Storage Initiative explore factors to consider in power efficiency measurement, including the nature of application workloads, IO streams, and access patterns; the choice of storage products (SSDs, HDDs, cloud storage, and more); the impact of hardware and software components (host bus adapters, drivers, OS layers); and access to read and write caches, CPU and GPU usage, and DRAM utilization. Join us on our journey to better power efficiency as we begin with Part 1: Key Issues in Power Efficiency Measurement. Bookmark this blog and check back in February, March, and April for the continuation of our four-part series. And explore the topic further in the SNIA Green Storage Knowledge Center. Part 1: Key Issues in Power Efficiency Measurement Ensuring accurate and precise power consumption measurements is challenging, especially at the individual device level, where even minor variations can have a significant impact. Achieving reliable data necessitates addressing factors like calibration, sensor quality, and noise reduction. Furthermore, varying workloads in systems require careful consideration to accurately capture transient power spikes and average power consumption. Modern systems are composed of interconnected components that affect each other’s power consumption, making it difficult to isolate individual component power usage. The act of measuring power itself consumes energy, creating a trade-off between measurement accuracy and the disturbance caused by measurement equipment. To address this, it’s important to minimize measurement overheads while still obtaining meaningful data. Environmental factors such as temperature, humidity, and airflow, can unpredictably influence power consumption, emphasizing the need for standardized test environments. Rapid workload changes can lead to transient power behavior that may require specialized equipment for accurate measurement. Software running on a system significantly influences power consumption, emphasizing the importance of selecting representative workloads and ensuring consistent software setups across measurements. Dynamic voltage and frequency scaling are used in many systems to optimize power consumption, and understanding their effects under different conditions is crucial. Correctly interpreting raw power consumption data is essential to draw meaningful conclusions about efficiency. This requires statistical analysis and context-specific considerations. Real-world variability, stemming from manufacturing differences, component aging, and user behavior, must also be taken into account in realistic assessments. Addressing these challenges necessitates a combination of precise measurement equipment, thoughtful experimental design, and a deep understanding of the system and device being investigated. In our next blog, Part 2, we will examine the impact of workloads on power efficiency measurement. The post Power Efficiency Measurement – Our Experts Make It Clear – Part 1 first appeared on SNIA Compute, Memory and Storage Blog.

Olivia Rhye

Product Manager, SNIA

Find a similar article by tags

Leave a Reply

Comments

Name

Email Adress

Website

Save my name, email, and website in this browser for the next time I comment.

Subscribe to