Abstract
It is well-known that storage cache performance is non-linear in cache size, and that the benefit of caching varies widely by workload: no two real workload mixes have the same cache behavior. Existing techniques for profiling workloads don't measure data reuse, nor do they predict how performance changes as cache allocations are varied. Because cache is a scarce resource, workload-aware cache behavior profiling is highly valuable and has many applications.
We will describe how to make storage cache analysis efficient enough to be built directly into a commercial cache controller. Based on work published at FAST '15, we'll show results including computing miss ratio curves (MRCs) online at high throughput (~20 million IOs per second on a single core).
The technique enables a large number of use cases across storage devices. These include visibility into cache performance curves for sizing the cache to actual customer workloads, troubleshooting field performance problems, online selection of cache parameters such as cache block size and read-ahead strategy to tune the array to those workloads, and dynamic MRC-guided cache partitioning, which improves cache hit ratios without adding hardware. Furthermore, the work applies to all types of application caches, not just those in enterprise storage systems.
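To make the central object concrete: a miss ratio curve plots the LRU miss ratio as a function of cache size, and the classic way to compute one exactly is Mattson's stack-distance method (a reference hits in an LRU cache of size C if and only if fewer than C distinct blocks were touched since its last access). The sketch below is a minimal, illustrative exact computation, not the high-performance online algorithm from the FAST '15 work, which relies on sampling to reach the throughput quoted above; the function name and parameters are chosen for this example.

```python
from collections import OrderedDict

def miss_ratio_curve(trace, cache_sizes):
    """Exact LRU MRC via Mattson stack distances (illustrative sketch).

    trace: sequence of block identifiers, in reference order.
    cache_sizes: cache sizes (in blocks) at which to evaluate the MRC.
    Returns one miss ratio per cache size.
    """
    stack = OrderedDict()              # LRU stack; most recent at the end
    hits = [0] * len(cache_sizes)
    for block in trace:
        if block in stack:
            # Reuse distance = depth from the top of the LRU stack.
            dist = list(reversed(stack)).index(block)
            for i, c in enumerate(cache_sizes):
                if dist < c:           # would hit in a cache of size c
                    hits[i] += 1
            stack.move_to_end(block)   # refresh recency
        else:
            stack[block] = True        # cold miss at every cache size
    n = len(trace)
    return [1.0 - h / n for h in hits]

# Cyclic pattern over 3 blocks: hits only once the cache holds all 3.
print(miss_ratio_curve([1, 2, 3, 1, 2, 3], [1, 2, 3, 4]))
```

Each reference scans the stack, so this is O(n) per access; the point of the FAST '15 work is precisely that such exact bookkeeping is too expensive for a production cache controller, motivating an approximate, constant-overhead online construction.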
Learning Objectives
Storage cache performance is non-linear
Benefit of caches varies widely by workload mix
Why working set size estimates don't work for caching
How to make storage cache analysis available in a commercial cache controller
New use cases for cache analysis in enterprise storage systems