Vector search systems have traditionally treated storage as a persistence layer for vector datasets and index recovery, relying on DRAM-resident structures for active query execution. As vector datasets scale beyond practical memory limits, modern NVMe storage offers an opportunity to rethink this model: NVMe-resident graph-based approximate nearest neighbor (ANN) indexes can serve the online search path directly, so that storage is no longer limited to the role of cold storage or index reload media. With its combination of low latency, high parallelism, density, persistence, and cost efficiency, NVMe has become an increasingly viable medium for scalable, storage-aware vector search architectures.
This session explores the evolution of DiskANN toward a storage-aware provider architecture for vector search. Instead of tightly coupling vector, graph, and metadata management to a monolithic ANN implementation, the architecture enables indexing and search logic to interact with provider interfaces for vectors, edge lists, quantization data, and metadata services while preserving storage-aware query behavior and NVMe optimization principles.
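As a rough illustration of the provider idea, the following minimal Python sketch separates graph search logic from where vectors and edge lists live. It is not DiskANN's actual API: the names `VectorProvider`, `NeighborProvider`, `InMemoryVectors`, `InMemoryNeighbors`, and `greedy_search` are hypothetical, and the in-memory dictionaries stand in for what would, under this design, be an NVMe-backed or tiered backend. Quantization and metadata providers are omitted for brevity.

```python
import heapq
from typing import Dict, List, Protocol, Sequence


class VectorProvider(Protocol):
    """Hypothetical interface: returns the vector stored for a node."""
    def get_vector(self, node_id: int) -> Sequence[float]: ...


class NeighborProvider(Protocol):
    """Hypothetical interface: returns the edge list (out-neighbors) of a node."""
    def get_neighbors(self, node_id: int) -> List[int]: ...


class InMemoryVectors:
    """Stand-in backend; a storage-aware backend would read from NVMe instead."""
    def __init__(self, vectors: Dict[int, Sequence[float]]) -> None:
        self._vectors = vectors

    def get_vector(self, node_id: int) -> Sequence[float]:
        return self._vectors[node_id]


class InMemoryNeighbors:
    """Stand-in backend for edge lists."""
    def __init__(self, edges: Dict[int, List[int]]) -> None:
        self._edges = edges

    def get_neighbors(self, node_id: int) -> List[int]:
        return self._edges.get(node_id, [])


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def greedy_search(query: Sequence[float], entry: int,
                  vectors: VectorProvider, neighbors: NeighborProvider,
                  k: int, beam_width: int) -> List[int]:
    """Best-first graph search that touches data only through the providers."""
    d0 = squared_l2(query, vectors.get_vector(entry))
    visited = {entry}
    frontier = [(d0, entry)]   # min-heap of candidates still to expand
    best = [(d0, entry)]       # sorted list of the closest nodes seen so far
    while frontier:
        dist, node = heapq.heappop(frontier)
        # Stop once the nearest unexpanded candidate cannot improve the beam.
        if len(best) >= beam_width and dist > best[-1][0]:
            break
        for nbr in neighbors.get_neighbors(node):
            if nbr in visited:
                continue
            visited.add(nbr)
            d = squared_l2(query, vectors.get_vector(nbr))
            if len(best) < beam_width or d < best[-1][0]:
                heapq.heappush(frontier, (d, nbr))
                best.append((d, nbr))
                best.sort()
                del best[beam_width:]
    return [node for _, node in best[:k]]


# Toy data: five points on a line, connected as a chain.
toy_vectors = InMemoryVectors({i: (float(i), 0.0) for i in range(5)})
toy_edges = InMemoryNeighbors({0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]})
result = greedy_search((3.2, 0.0), entry=0, vectors=toy_vectors,
                       neighbors=toy_edges, k=2, beam_width=3)
```

Because `greedy_search` depends only on the two interfaces, a different backend (for example, one that batches reads against an SSD) could in principle be substituted without changing the search loop, which is the kind of decoupling the provider architecture aims for.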
The session focuses in particular on the continued relevance of NVMe storage for vector search workloads, including graph locality, random-read amplification, tiered storage placement, and cost-efficient scaling beyond DRAM-only architectures. It also discusses how provider-based integration helps research and production implementations converge while simplifying adoption across emerging AI and vector database environments.
A proof-of-concept deployment will demonstrate the provider-based architecture operating across multiple storage backends and storage tiers while leveraging NVMe as a foundational component of scalable vector infrastructure. The session will include example indexing workflows, search execution, and integration patterns illustrating how DiskANN can evolve beyond standalone SSD-resident libraries into a flexible storage-aware vector search framework.
Attendees will leave with a practical understanding of storage-aware ANN architecture design, the tradeoffs involved in NVMe-based vector indexing, and integration patterns for scalable vector search across modern storage and AI platforms.