Building on Storage for AI 103, which introduced why inference needs a KV cache (context memory storage) and the storage that backs it, this presentation will go deeper into the key types of storage used in inference. It will also cover how KV cache, RAG with vector database search, and RAG with graph neural network search lead to three distinct workloads, each with different requirements for the backing storage.