Oct 20, 2020
Everyone is looking to squeeze more efficiency from storage. That’s why the
SNIA Networking Storage Forum hosted a live webcast last month, “Compression: Putting the Squeeze on Storage.” The audience asked many great questions on compression techniques. Here are answers from our expert presenters, John Kim and Brian Will:
Q. When multiple unrelated entities are likely to compress the data, how do they understand that the data is already compressed and so skip the compression?
A. Often they can tell from the file extension or header that the file has already been compressed. Otherwise each entity that wants to compress the data will try to compress it and then discard the results if it makes the file larger (because it was already compressed).
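To make the "try it and discard it" approach concrete, here is a minimal Python sketch (illustrative only, not code from the webcast): it skips data whose leading bytes match a few well-known compressed-format signatures, and otherwise keeps the compressed output only if it is actually smaller than the input.

```python
import zlib

# A few signatures of already-compressed formats (illustrative, not exhaustive).
COMPRESSED_MAGIC = (
    b"\x1f\x8b",      # gzip
    b"PK\x03\x04",    # zip
    b"\xff\xd8\xff",  # JPEG
)

def maybe_compress(data: bytes):
    """Return (payload, is_compressed); skip data that already looks compressed."""
    if data.startswith(COMPRESSED_MAGIC):
        return data, False
    candidate = zlib.compress(data, 6)
    if len(candidate) >= len(data):   # compression did not save space: discard it
        return data, False
    return candidate, True
```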
Q. I’m curious about the storage efficiency of data reduction techniques (compression, thin provisioning, etc.) on certain database/server workloads where they end up being more of a hindrance. For example, Oracle ASM does not perform very well under any form of storage efficiency method. In such scenarios, what would be the recommendation to ensure storage is judiciously utilized?
A. Compression works well for some databases but not others, depending both on how much data repetition occurs within the database and how the database tables are structured. Database compression can be done on the row, column or page level, depending on the method and the database structure. Thin provisioning generally works best if multiple applications using the storage system (such as the database application) want to reserve or allocate more space than they actually need. If your database system does not like the use of external (storage-based, OS-based, or file system-based) space efficiency techniques, you should check if it supports its own internal compression options.
Q. What is a DPU?
A. A DPU is a data processing unit that specializes in moving, analyzing and processing data as it moves in and out of servers, storage, or other devices. DPUs usually combine network interface card (NIC) functionality with programmable CPU and/or FPGA cores. Some possible DPU functions include packet forwarding, encryption/decryption, data compression/decompression, storage virtualization/acceleration, executing SDN policies, running a firewall agent, etc.
Q. What's the difference between compression and compaction?
A. Compression replaces repeated data with either shorter symbols or pointers that represent the original data but take up less space. Compaction eliminates empty space between blocks or inside of files, often by moving real data closer together. For example, if you store multiple 4KB chunks of data in a storage system that uses 32KB blocks, the default storage solution might consume one 32KB storage block for each 4KB of data. Compaction could put 5 to 8 of those 4KB data chunks into one 32KB storage block to recover wasted free space.
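To put numbers on that example, here is a small Python sketch (illustrative arithmetic only) comparing how many 32KB blocks are consumed with and without compaction:

```python
import math

BLOCK = 32 * 1024   # storage block size in bytes
CHUNK = 4 * 1024    # size of each data chunk in bytes

def blocks_without_compaction(num_chunks: int) -> int:
    return num_chunks                                  # one mostly-empty block per chunk

def blocks_with_compaction(num_chunks: int) -> int:
    return math.ceil(num_chunks / (BLOCK // CHUNK))    # up to 8 chunks per block

print(blocks_without_compaction(100))  # 100 blocks (3200 KB consumed)
print(blocks_with_compaction(100))     # 13 blocks  (416 KB consumed)
```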
Q. Is data encryption at odds with data compression? That is, is data encryption a problem for data compression?
A. If you encrypt data first, it usually makes compression of the encrypted data difficult or impossible, depending on the encryption algorithm. (A simple substitution cipher would still allow compression but wouldn't be very secure.) In most cases, the answer is to first compress the data and then encrypt it. To reverse the process, first decrypt the data and then decompress it.
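Here is a hedged Python sketch of the compress-then-encrypt ordering. It assumes the third-party cryptography package is installed; the Fernet recipe is just a convenient stand-in for whatever encryption scheme you actually use.

```python
import zlib
from cryptography.fernet import Fernet  # pip install cryptography

key = Fernet.generate_key()
cipher = Fernet(key)

def protect(plaintext: bytes) -> bytes:
    # Compress first, while repetition is still visible, then encrypt.
    return cipher.encrypt(zlib.compress(plaintext))

def recover(token: bytes) -> bytes:
    # Reverse the order coming back: decrypt first, then decompress.
    return zlib.decompress(cipher.decrypt(token))

data = b"highly repetitive payload " * 1000
assert recover(protect(data)) == data
```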
Q. How do we choose the binary form code 00, 01, 101, 110, etc?
A. These are the final symbol representations written into the output data stream. The table shown in the presentation is only illustrative; the DEFLATE RFC documents a complete algorithm for representing symbols in a compacted binary form.
Q. Is there a resource for different algorithms vs CPU requirements vs compression ratios?
A. A good resource to see the cost versus ratio trade-offs of different algorithms is on GitHub here. This utility covers a wide range of compression algorithms, implementations and levels. The data shown on their GitHub location is benchmarked against the Silesia corpus, which represents a number of different data sets.
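If you just want a quick local feel for the trade-off before digging into that benchmark data, a short Python sketch along these lines (using made-up sample data rather than the Silesia corpus) reports time and ratio per zlib level:

```python
import time
import zlib

# Mildly repetitive sample data; substitute your own files for a realistic test.
sample = b"some mostly-text payload with repeated phrases " * 2000

for level in (1, 6, 9):
    start = time.perf_counter()
    out = zlib.compress(sample, level)
    elapsed = time.perf_counter() - start
    print(f"zlib level {level}: {elapsed * 1e3:.2f} ms, "
          f"ratio {len(out) / len(sample):.2%}")
```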
Q. Do these operations occur on individual data blocks, or is this across the entire compression job?
A. Assuming you mean the compression operations, it typically occurs across multiple data blocks in the compression window. The compression window almost always spans more than one data block but usually does not span the entire file or disk/SSD, unless it's a small file.
Q. How do we guarantee that important information is not lost during the lossy compression?
A. Lossy compression is not my current area of expertise, but there is a significant area of information theory called rate-distortion theory, which is used for the quantization of images during compression, that may be of interest. In addition, lossy compression is typically only used for files/data where it's known the users of that data can tolerate the data loss, such as images or video. The user or application can typically adjust the compression ratio to ensure an acceptable level of data loss.
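As a hedged illustration of that user-adjustable trade-off, assuming the Pillow imaging library is installed, the JPEG quality parameter is exactly this kind of knob:

```python
import io
import os
from PIL import Image  # pip install Pillow

# A synthetic noisy grayscale image stands in for real photo data.
img = Image.frombytes("L", (256, 256), os.urandom(256 * 256))

for quality in (90, 50, 10):
    buf = io.BytesIO()
    img.save(buf, format="JPEG", quality=quality)
    # Lower quality -> smaller file, but more information discarded.
    print(f"quality={quality}: {buf.tell()} bytes")
```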
Q. Do you see any advantage in performing the compression on the same CPU controller that is managing the flash (running the FTL, etc.)?
A. There may be cache benefits from running compression and flash on the same CPU, depending on the size of the transactions. If the CPU is on the SSD controller itself, running compression there could offload the work from the main system CPU, allowing it to spend more cycles running applications instead of doing compression/decompression.
Q. Before compressing data, is there a method to check if the data is good to be compressed?
A. Some compression systems can run a quick scan of a file to estimate the likely compression ratio. Other systems look at the extension and/or header of the file and skip attempts to compress it if it looks like it's already compressed, such as most image and video files. Another solution is to actually attempt to compress the file and then discard the compressed version if it's larger than the original file.
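One common form of such a quick scan is a byte-entropy estimate over a sample of the data. The Python sketch below is an assumption about how such a pre-check might look, not a description of any particular product:

```python
import math
from collections import Counter

def entropy_bits_per_byte(sample: bytes) -> float:
    """Shannon entropy of the byte distribution; 8.0 means random-looking data."""
    if not sample:
        return 0.0
    counts = Counter(sample)
    total = len(sample)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def looks_compressible(data: bytes, sample_size: int = 64 * 1024) -> bool:
    # Near 8 bits/byte usually means the data is already compressed or encrypted.
    return entropy_bits_per_byte(data[:sample_size]) < 7.5
```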
Q. If we were to compress on a storage device (SSD), what do you think are the top challenges? Error propagation? Latency/QoS or other?
A. Compressing on a storage device could mean higher latency for the storage device, both when writing files (if compression is inline) or when reading files back (as they are decompressed). But it's likely this latency would otherwise exist somewhere else in the system if the files were being compressed and decompressed somewhere other than on the storage device. Compressing (and decompressing) on the storage device means the data will be transmitted to (and from) the storage while uncompressed, which could consume more bandwidth. If an SSD is doing post compression (i.e. compression after the file is stored and not inline as the file is being stored), it would likely cause more wear on the SSD because each file is written twice.
Q. Are all these CPU-based compression analyses?
A. Yes these are CPU-based compression analyses.
Q. Can you please characterize the performance difference between, say LZ4 and Deflate in terms of microseconds or nanoseconds?
A. Extrapolating from the data available here, an 8KB request using LZ4 fast level 3 (lz4fast 1.9.2 -3) would take 9.78 usec for compression and 1.85 usec for decompression, while using zlib level 1 for an 8KB request, compression takes 68.8 usec and decompression takes 21.39 usec. Another aspect to note is that while LZ4 fast level 3 takes significantly less time, its compression ratio is 50.52% versus 36.45% for zlib level 1, showing that better compression ratios can have a significant cost.
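Converting those latencies into rough per-core throughput is simple arithmetic on the numbers quoted above:

```python
request = 8 * 1024  # bytes

for name, usec in [("lz4fast-3 compress", 9.78), ("lz4fast-3 decompress", 1.85),
                   ("zlib-1 compress", 68.8), ("zlib-1 decompress", 21.39)]:
    mb_per_s = request / (usec * 1e-6) / 1e6
    print(f"{name}: ~{mb_per_s:.0f} MB/s")
# lz4fast-3: ~838 MB/s compress, ~4428 MB/s decompress
# zlib-1:    ~119 MB/s compress,  ~383 MB/s decompress
```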
Q. How important is the compression ratio when you are using specialty products?
A. The compression ratio is a very important result for any compression algorithm or implementation.
Q. In slide #15, how do we choose the binary code form for the characters?
A. The binary code form in this example is entirely controlled by the frequency of occurrence of each symbol within the data stream: the higher the symbol frequency, the shorter the binary code assigned. The algorithm used here is just for illustrative purposes and would not be used (at least in this manner) in a standard. Huffman encoding in DEFLATE is a good example of a defined encoding algorithm.
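For readers who want to see the frequency-to-code-length idea end to end, here is a compact Huffman-coding sketch in Python. It is illustrative only; DEFLATE's canonical Huffman construction in RFC 1951 differs in its details.

```python
import heapq
from collections import Counter

def huffman_codes(data: bytes) -> dict:
    """Assign shorter bit strings to more frequent byte values."""
    heap = [[freq, i, {sym: ""}] for i, (sym, freq) in enumerate(Counter(data).items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        lo = heapq.heappop(heap)
        hi = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in lo[2].items()}
        merged.update({s: "1" + code for s, code in hi[2].items()})
        heapq.heappush(heap, [lo[0] + hi[0], tiebreak, merged])
        tiebreak += 1
    return heap[0][2]

print(huffman_codes(b"aaaaaabbbc"))  # 'a' gets the shortest code, 'c' the longest
```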
This webcast was part of a SNIA NSF series on data reduction. Please check out the other two sessions:
Oct 15, 2020
A few weeks ago, SNIA EMEA hosted a webcast to introduce the concept of RAID on CPU. The invited experts, Fausto Vaninetti from Cisco, and Igor Konopko from Intel, provided fascinating insights into this exciting new technology.
The webcast created a huge amount of interest and generated a host of follow-up questions which our experts have addressed below. If you missed the live event “RAID on CPU: RAID for NVMe SSDs without a RAID Controller Card” you can watch it on-demand.
Q. Why not RAID 6?
A. RAID on CPU is a new technology, and current support covers the most commonly used RAID levels, considering this is for servers rather than disk arrays. RAID 5, which tolerates a single drive failure, is the primary parity RAID level for NVMe because of lower annual failure rates (AFRs) and faster rebuilds.
Q. Is the XOR for RAID 5 done in Software?
A. Yes, it is done in software on some cores of the Xeon CPU.
Q. Which generation of Intel CPUs support VROC?
A. All Intel Xeon Scalable Processors, starting with Generation 1 and continuing through with new CPU launches support VROC.
Q. How much CPU performance is used by the VROC implementation?
A. It depends on OS, Workload, and RAID Level. In Linux, Intel VROC is a kernel storage stack and not directly tied to specific cores, allowing it to scale based on the IO demand to the storage subsystem. This allows performance to scale as the number of NVMe drives attached increases. Under lighter workloads, Intel VROC and HBAs have similar CPU consumption. Under heavier workloads, the CPU consumption increases for Intel VROC, but so does the performance (IOPS/bandwidth), while the HBA hits a bottleneck (i.e. limited scaling). In Windows, CPU consumption can be higher, and performance does not scale as well due to differences in the storage stack implementation.
Q. Why do we only see VROC on Supermicro and Intel servers? Do the others not have the technology, or have they preferred not to implement it?
A. This is not correct. There are more vendors supporting VROC than Supermicro and Intel itself. For example, Cisco is fully behind this technology and has a key-less implementation across its UCS B and C portfolio. New designs with VROC are typically tied to new CPU/platform launches from Intel, so keep an eye on your preferred platform providers as new platforms are launched.
Q. Are there plans from VROC to have an NVMe Target Implementation to connect external hosts?
A. Yes, VROC can be included in an NVMe-oF target. While not the primary use case for VROC, it will work. We are exploring this with customers to understand gaps and additional features to make VROC a better fit.
Q. Are there limitations for dual CPU configurations or must the VROC be configured for single CPU?
A. VROC can be enabled on dual CPU servers as well as single CPU servers. The consideration to keep in mind is that a RAID volume spanning multiple CPUs could see reduced performance, so it is not recommended if it can be avoided.
Q. I suggest having a key or explaining what x16 PCIe means in the diagrams. It does mean the memory, right?
A. PCIe x16 indicates a PCIe bus connection with 16 lanes; it does not refer to memory.
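For a rough sense of scale, assuming PCIe Gen3 signaling at roughly 0.985 GB/s per lane per direction (an assumption on our part; the webcast did not quote these figures):

```python
GEN3_GBPS_PER_LANE = 0.985  # assumed PCIe Gen3 payload rate per lane, per direction

for lanes in (4, 8, 16):
    print(f"PCIe Gen3 x{lanes}: ~{lanes * GEN3_GBPS_PER_LANE:.1f} GB/s per direction")
# x4 ≈ 3.9 GB/s (a typical NVMe SSD link), x16 ≈ 15.8 GB/s
```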
Q. Do you have maximum performance results (IOPS, random read) of VROC on 24 NVMe devices?
A. This webinar presented some performance results. If more is needed, please contact your server vendor. Some additional performance results can be found at www.intel.com/vroc in the Support and Documentation Section at the bottom.
Q. I see "tps" and "ops" and "IOPs" within the presentation. Are they all the same? Transactions Per Second = Operations Per Second = I/O operations per second?
A. No, they are not the same. I/O operations per second are closer to storage concepts, while transactions per second are closer to application concepts.
Q. I see the performance of random read is not scaling to 4 times pass-through (2.5M) in the case of Windows (952K), whereas in Linux it is scaling (2.5M). What could be the reason for such low performance?
A. Due to differences in the way operating systems work, Linux is offering the best performance so far.
Q. Is there an example of using VROC for VMware (ESXi)?
A. VROC RAID is not supported for ESXi, but VMD is supported for robust attachment of NVMe SSDs with hot-plug and LED management.
Q. How do you protect RAID-1 data integrity if a power loss happened after only one drive is updated?
A. With a RAID 1 schema, your data can still be read when only a single drive has been written. With RAID 5, you need multiple drives available to rebuild your data.
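A conceptual Python sketch of the difference (this is just the underlying algebra, not how VROC is implemented): RAID 1 can serve reads from either mirror copy on its own, while rebuilding a lost RAID 5 strip requires every remaining member plus the parity strip.

```python
# RAID 1: either mirror copy alone is enough to read the data.
data = b"\x12\x34\x56\x78"
mirror_a, mirror_b = data, data
assert mirror_a == data and mirror_b == data

# RAID 5: parity is the XOR of the data strips; rebuilding a lost strip
# needs all the surviving strips plus the parity.
d1, d2, d3 = b"\x01\x02", b"\x0a\x0b", b"\xf0\xf1"
parity = bytes(a ^ b ^ c for a, b, c in zip(d1, d2, d3))
rebuilt_d2 = bytes(a ^ b ^ c for a, b, c in zip(d1, d3, parity))
assert rebuilt_d2 == d2
```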
Q. Where can I learn more about the VROC IC option?
A. Contact your server vendor or Intel representatives.
Q. In the last two slides, the MySQL and MongoDB configs, is the OS / boot protected? Is Boot on the SSDs or other drive(s)?
A. In this case boot was on a separate device and was not protected, but only because this was a test server. Intel VROC does support bootable RAID, so RAID1 redundant boot can be applied to the OS. This means on 1 platform, Intel VROC can support RAID1 for boot and other separate data RAID sets.
Q. Does this VROC need separate OS Drivers or do they have inbox support (for both Linux and Windows)?
A. There is inbox support in Linux and, to get the latest features, the recommendation remains to use the latest available OS releases. In some cases, a Linux OS driver is provided to backport support to older OS releases. In Windows, everything is delivered through an OS driver package.
Q. 1. Is Intel VMD a 'hardware' feature to newer XEON chips? 2. If VMD is software, can it be installed into existing servers? 3. If VMD is on a server today, can VROC be added to an existing server?
A. VMD is a prerequisite for VROC and is a hardware feature of the CPU along with relevant UEFI and OS drivers. VMD is possible on Intel Xeon Scalable Processors, but it also needs to be enabled by the server's motherboard and its firmware. It’s best to talk to your server vendor.
Q. In traditional spinning rust RAID, drive failure is essentially random (chance increases based on power on hours); with SSDs, failure is not mechanical and is ultimately based on lifetime utilization/NAND cells wearing out. How does VROC or RAID on CPU in general handle wear leveling to ensure that a given disk group doesn't experience multiple SSD failures at roughly the same time?
A. In general, server vendors have a way to show the wear level of supported SSDs, and that can help in this respect.
Q. Any reasons for not using caching on Optane memory instead of Optane SSD?
A. Using an Intel Optane Persistent Memory Module is a use case that may be supported over time. The current caching implementation requires a block device, so using an Intel Optane SSD was the more direct use case.
Q. Wouldn't the need to add 2x Optane drives negate the economic benefit of VROC vs hardware RAID?
A. It depends on use cases. Clearly there is a cost associated to adding Optane in the mix. In some cases, only 2x 100GB Intel Optane SSDs are needed, which is still economical.
Q. Does VROC require a Platinum processor? Do Gold/Silver processors support VROC?
A. Intel VROC and VMD are supported across the Intel Xeon Scalable Processor product SKUs (Bronze through Platinum) as well as other product families such as Intel Xeon-D and Intel Xeon-W.
Q. Which NVMe spec is VROC complying to?
A. NVMe 1.4
Q. WHC is disabled by default. When should it be enabled? After a write fault has happened, or before I/O operations start?
A. WHC should be enabled before you start writing data to your volumes. It can be required for critical data where data corruption cannot be tolerated under any circumstance.
Q. Which vendors offer Intel VROC with their systems?
A. Multiple vendors as of today, but the specifics of implementation, licensing and integrated management options may differ.
Q. Is VROC available today?
A. Yes, it launched in 2017.
Q. Is there a difference in performance between the processor categories? Do Platinum, Gold and Silver have the same benefits?
A. Different processor categories have inherent performance differences of their own, but VROC itself does not differ across those CPUs.
Q. In a dual-CPU config, if there is an issue with the VMD on one processor, is there any protection?
A. This depends on how the devices are connected. SSDs could be connected to different VMDs and in RAID1 arrays to offer protection. However, VMD is a HW feature of the PCIe lanes and is not a common failure scenario.
Q. How many PCIe lanes on the CPU can be used for NVMe drives, and do Intel CPUs have enough PCIe lanes?
A. All CPU lanes on Intel Xeon Scalable Processors are VMD capable, but the actual lanes available for direct NVMe SSD connection depend on the server's motherboard design, so it is not the same for all vendors. In general, consider that about 50% of the PCIe lanes on the CPU can be used to connect NVMe SSDs.
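As a rough lane-budget sketch, assuming 48 PCIe lanes per 1st/2nd Gen Intel Xeon Scalable CPU and x4 per NVMe SSD (assumptions on our part; actual numbers depend on the motherboard design, as noted above):

```python
LANES_PER_CPU = 48        # assumed for 1st/2nd Gen Intel Xeon Scalable
LANES_PER_NVME_SSD = 4    # typical NVMe SSD link width
STORAGE_SHARE = 0.5       # rough guideline from the answer above

usable = int(LANES_PER_CPU * STORAGE_SHARE)
print(f"~{usable} lanes -> about {usable // LANES_PER_NVME_SSD} "
      f"directly attached NVMe SSDs per CPU")   # ~24 lanes -> about 6 SSDs
```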
Q. What is the advantage of Intel VMD/VROC over Storage Spaces (which is built in SWRAID solution in Windows)?
A. VROC supports both Linux and Windows and has a pre-OS component to offer bootable RAID.
Q. If I understand correctly, Intel VROC is hybrid RAID. Does it require an OS utility like mdadm to manage the array on a Linux host?
A. VROC configuration is achieved in many ways, including the Intel GUI or CLI tools. In Linux, the mdadm OS utility is used to manage the RAID arrays.
Q. Will you go over matrix raid? Curious about that one.
A. Matrix RAID allows multiple RAID levels to be configured on the same set of disks, if space is available. For example: a 4-disk RAID 10 volume of 1TB plus a RAID 5 volume using the remaining space on the same 4 disks.
Q. I see a point saying VROC had better performance… Are there any VROC performance metrics (say, 4K random read/write IOPS and 1M sequential reads/writes) available with Intel NVMe drives? Any comparison with any SW RAID or HBA RAID solutions?
A. For comparisons, it is best to refer to specific server vendors since they are not all the same. Some generic performance comparisons can be found at www.intel.com/vroc in the Support and Documentation section at the bottom.
Q. Which Linux Kernel & Windows version supports VROC?
A. VROC has an interoperability matrix posted on the web at this link: https://www.intel.com/content/dam/support/us/en/documents/memory-and-storage/ssd-software/Intel_VROC_Supported_Configs_6-3.pdf
Q. Does VROC support JBOD to be used with software-defined storage? Can we create a RAID 1 for boot and a JBOD for vSAN or Microsoft, for example?
A. Yes, these use cases are possible.
Q. Which ESXi supports VMD (6.5 or 7 or both)? Any forecast for supporting VROC in future releases?
A. ESXi supports VMD starting at version 6.5U2 and continues forward with 6.7 and 7.0 releases.
Q. Can VMD support VROC with more than 4 drives?
A. VROC can support up to 48 NVMe SSDs per platform
Q. What is the maximum no. of drives supported by a single VMD domain?
A. Today a VMD domain has 16 PCIe lanes so 4 NVMe SSD drives are supported as direct attached per domain. If switches are used 24 NVMe SSDs can be attached to one VMD domain.
Q. Does VROC use any Caching mechanisms either through the Firmware or the OS Driver?
A. No, there is no caching in VROC today; it is being considered as a future option.
Q. How does VROC close RAID5 write hole?
A. Intel VROC uses a journaling mechanism to track in flight writes and log them using the Power Loss Imminent feature of the RAID member SSDs. In case of a double fault scenario that could cause a RAID Write Hole corruption scenario, VROC uses these journal logs to prevent any data corruption and rebuild after reboot.
Q. So is the RAID part of VMD? Or is VMD only used for LED and hot-plug?
A. VMD is prerequisite to VROC so it is a key element. In simple terms, VROC is the RAID capability, all the rest is VMD.
Q. What does LED mean?
A. Light Emitting Diode; in this context it refers to the drive status and locate lights that VMD can manage.
Q. What is the maximum no. of NVMe SSDs that are supported by Intel VROC at a time?
A. That number would be 48, but you need to ask your server vendor since the motherboard needs to be capable of that.
Q. Definition of VMD domain?
A. A VMD domain can be described as a CPU-integrated End-Point to manage PCIe/NVMe SSDs. VMD stands for Volume Management Device.
Q. Does VROC also support ESXi as a bootable device?
A. No, ESXi is not supported by VROC, but VMD is. In future releases, ESXi VMD functionality may add some RAID capabilities.
Q. Which are the Intel CPU Models that supports VMD & VROC?
A. All Intel Xeon Scalable Processors
Q. Is Intel VMD present on all Intel CPUs by default?
A. Intel Xeon Scalable Processors are required. But you also need to have the support on the server's motherboard.
Q. How is Software RAID (which uses system CPU) different than CPU RAID used for NVMe?
A. By software RAID we mean a RAID mechanism that kicks in after the operating system has booted. Some vendors use the term SW RAID in a different way. CPU RAID for NVMe is a function of the CPU, rather than the OS, and also includes pre-OS/BIOS/platform components.
Q. I have been interested in VMD/VROC since it was introduced to me by Intel in 2017 with Intel Scalable Xeon (Purley) and the vendor I worked with then, Huawei, and now Dell Technologies has never adopted it into an offering. Why? What are the implementation impediments, the cost/benefit, and vendor resistance impacting wider adoption?
A. Different server vendors decide what technologies they are willing to support and with which priority. Today multiple server vendors are supporting VROC, but not all of them.
Q. What's the UBER (unrecoverable bit error rate) for NVMe drives? Same as SATA (10^-14), or SAS (10^-16), or other? (since we were comparing them - and it will be important for RAID implementations)
A. UBER is not influenced by VROC at all. In general, the UBER of SATA SSDs is very similar to that of NVMe SSDs.
Q. Can we get some more information or examples of Hybrid RAID. How is it exactly different from SWRAID?
A. In our description, SW RAID requires the OS to be operational before RAID can work. With hybrid RAID, this is not the case. Also, hybrid RAID has a HW component that acts similarly to an HBA, in this case VMD; SW RAID does not have this isolation.
Oct 12, 2020
It's well known that data is often considered less secure while in motion, particularly across public networks, and attackers are finding increasingly innovative ways to snoop on and compromise data in flight. But risks can be mitigated with foresight and planning. So how do you adequately protect data in transit? It’s the next topic the SNIA Networking Storage Forum (NSF) will tackle as part of our Storage Networking Security Webcast Series. Join us October 28, 2020 for our live webcast Securing Data in Transit.
In this webcast, we'll cover what the threats are to your data as it's transmitted, how attackers can interfere with data along its journey, and methods of putting effective protection measures in place for data in transit.
Register today and join us on a journey to provide safe passage for your data.
Oct 7, 2020
As explained in our webcast on Data Reduction, “Everything You Wanted to Know About Storage But Were Too Proud to Ask: Data Reduction,” organizations inevitably store many copies of the same data. Intentionally or inadvertently, users and applications copy and store the same files over and over; with developers, testers and analysts keeping many more copies. And backup programs copy the same or only slightly modified files daily, often to multiple locations and storage devices. It’s not unusual to end up with some data replicated thousands of times, enough to drive storage administrators and managers of IT budgets crazy.
So how do we stop the duplication madness? Join us on November 10, 2020 for a live SNIA Networking Storage Forum (NSF) webcast, “Not Again! Data Deduplication for Storage Systems” where our SNIA experts will discuss how to reduce the number of copies of data that get stored, mirrored, and backed up.
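As a taste of the idea, here is a minimal (and deliberately naive) Python sketch of content-addressed deduplication, where identical chunks are stored once and referenced by their hash; it is an illustration, not a description of any particular product:

```python
import hashlib

store = {}   # chunk hash -> chunk data, stored only once

def dedup_write(data: bytes, chunk_size: int = 4096) -> list:
    """Split data into chunks and keep a single copy of each unique chunk."""
    recipe = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        store.setdefault(digest, chunk)   # duplicate chunks are skipped
        recipe.append(digest)
    return recipe

def dedup_read(recipe: list) -> bytes:
    return b"".join(store[d] for d in recipe)

backup = b"the same report, copied again and again " * 10_000
recipe = dedup_write(backup)
assert dedup_read(recipe) == backup
print(f"logical {len(backup)} bytes, physically stored "
      f"{sum(len(c) for c in store.values())} bytes")
```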
Attend this sanity-saving webcast to learn more about data deduplication for storage systems.
Register today (but only once please) for this webcast so you can start saving space and end the extra data replication.