Abstract
Modern CPU and NVMe technologies have opened up new horizons for Software-Defined Storage (SDS) performance. However, many people are either unaware of this potential or consider it unreachable, so current software solutions are not ready for the new hardware capabilities. We want to share what we learned while developing an SDS engine that achieves 25 million IOPS per storage node. We will describe the challenges we encountered and resolved while developing a universal engine for both the Linux kernel and SPDK. We will show performance optimizations and CPU-friendly hints, and discuss how to avoid mistakes when developing all-flash storage, how not to waste millions of IOPS, and how to reach the limits of the new hardware.