Batched I/O: Reducing Overhead in the Storage Stack

Abstract

Modern NVMe devices can sustain millions of IOPS by processing many queues efficiently in parallel. Yet the software storage stack still typically handles each I/O request individually: allocating resources, acquiring locks, traversing driver layers, and completing requests one at a time. As device speeds increase, this per-request software overhead becomes a significant fraction of the total cost of each operation. We present a new batched I/O mechanism for the Windows storage stack that allows an entire set of I/O operations to be described, dispatched, and completed as a single unit. By amortizing fixed costs across many operations and giving each layer of the stack the opportunity to process work in bulk, batched I/O significantly reduces per-operation CPU overhead and improves throughput at high queue depths. This talk covers the motivation behind the design, the challenges of propagating batched work through a multi-layered driver stack, and early performance results demonstrating the impact on real workloads.