Abstract
Recent trends in software-defined storage push for hyper-convergence, where compute, networking and storage are combined within one device. In this talk, we give an update on our work since our last talk at SDC 2016. We present results from our ongoing R&D work on heterogeneous multi-processing, where we combine ARM processors for control-plane functionality with dataflow architectures in FPGAs to handle data processing in networked storage systems at 100 GigE line rates and beyond. Our proof-of-concept implementation uses a Xilinx Zynq UltraScale+ MPSoC as a single-chip solution featuring multiple NVMe devices with PCIe Gen3 connectivity and dual 100 GigE networking. To analyze the effects of DRAM, 3D XPoint and NAND Flash in multi-tiered hybrid-memory storage architectures, we present a non-intrusive on-chip performance monitoring solution implemented in FPGA logic. These monitors give fine-grained insight into the latency and bandwidth of dataflow processing and help guide the balance between the different storage technologies and interfaces when trading off capacity, performance and power. They also support evaluating the benefits of software-defined inline processing, such as FPGA-accelerated encryption, video compression, object recognition or deep-learning algorithms, without a performance penalty.