Data Deduplication as a Platform for Virtualization and High Scale Storage | SNIA

Abstract

The primary data deduplication system in Windows Server 2012 is designed to achieve high deduplication savings with low computational overhead on commodity storage platforms. In this talk, we will build upon that foundational work and present new techniques to scale primary data deduplication on both the primary data serving and optimization pathways. These include hardware-accelerated performance improvements for hashing and compression, better file system integration to reduce write path overheads and optimize live files, and deduplication-aware caching to mitigate disk bottlenecks. We will show how this enables deduplication to be leveraged as a platform for storage virtualization.

Learning Objectives

Fundamental building blocks for a primary data deduplication system.
Deduplication data serving for “live data” as a storage layer for virtualization workloads.
Optimization of data at high scale with little to no impact on the compute resources of the virtualization platform.
Utilizing data deduplication as a means to implement an efficient system cache.
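
To make the building blocks named above concrete, the following is a minimal sketch in Python, not the Windows Server implementation. It assumes fixed-size chunks, SHA-256 chunk hashes, zlib compression, and an in-memory store; `CHUNK_SIZE`, `ChunkStore`, `ChunkCache`, `optimize`, and `read_file` are hypothetical names introduced only for illustration. The cache is keyed by chunk hash rather than by file offset, which is one simple way a cache can be made deduplication aware: two files that share a chunk also share a single cache entry.

```python
import hashlib
import zlib
from collections import OrderedDict

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, chosen only for illustration


class ChunkStore:
    """Keeps each unique chunk once, compressed, keyed by its SHA-256 digest."""

    def __init__(self):
        self.chunks = {}  # digest -> compressed chunk bytes

    def put(self, chunk):
        digest = hashlib.sha256(chunk).digest()
        if digest not in self.chunks:  # duplicate chunks are stored only once
            self.chunks[digest] = zlib.compress(chunk)
        return digest

    def get(self, digest):
        return zlib.decompress(self.chunks[digest])


class ChunkCache:
    """LRU cache keyed by chunk digest, so shared chunks share one entry."""

    def __init__(self, store, capacity):
        self.store = store
        self.capacity = capacity
        self.entries = OrderedDict()  # digest -> uncompressed chunk bytes
        self.hits = 0
        self.misses = 0

    def read(self, digest):
        if digest in self.entries:
            self.entries.move_to_end(digest)  # mark as most recently used
            self.hits += 1
            return self.entries[digest]
        self.misses += 1
        chunk = self.store.get(digest)
        self.entries[digest] = chunk
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return chunk


def optimize(data, store):
    """Split data into chunks and return the file's recipe (list of digests)."""
    return [store.put(data[i:i + CHUNK_SIZE])
            for i in range(0, len(data), CHUNK_SIZE)]


def read_file(recipe, cache):
    """Rebuild file contents from its recipe, reading chunks through the cache."""
    return b"".join(cache.read(digest) for digest in recipe)


# Two hypothetical VM images that share four chunks and differ in the last one.
base = b"".join(bytes([i]) * CHUNK_SIZE for i in range(4))
vm_a = base + b"A" * CHUNK_SIZE
vm_b = base + b"B" * CHUNK_SIZE

store = ChunkStore()
cache = ChunkCache(store, capacity=8)
recipe_a = optimize(vm_a, store)
recipe_b = optimize(vm_b, store)

assert read_file(recipe_a, cache) == vm_a
assert read_file(recipe_b, cache) == vm_b
print(len(store.chunks), cache.hits, cache.misses)
```

In this toy run the ten logical chunks of the two images reduce to six stored chunks, and reading the second image hits the cache for every chunk it shares with the first. A production system would add persistence, reference counting, and garbage collection for chunks, none of which is modeled here.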