SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA
The recent uptick in generative artificial intelligence (GAI) has put more pressure on hardware vendors to reduce the carbon footprint of running these power-hungry large language models (LLMs) in the datacenter. One way to achieve a lower in-silicon power profile is to break the von Neumann bottleneck by tightly integrating traditional SRAM memory cells with interleaved programmable processors on the same die. We report on our progress in this area, in particular leveraging recent open research in both mixed-precision mathematics and extreme low-bit quantization of deep learning model parameters and activations running in our custom "In-SRAM" processor.
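The abstract does not specify the quantization scheme used in the In-SRAM processor; as a minimal illustration of what "extreme low-bit quantization" of model parameters can mean, the sketch below applies symmetric per-tensor 4-bit quantization to a weight matrix. The bit width, rounding rule, and scaling choice are assumptions for illustration, not the authors' method.

```python
# Hedged sketch: symmetric per-tensor 4-bit quantization of weights, as one
# illustration of "extreme low-bit quantization". This is NOT the In-SRAM
# scheme from the talk; bit width and rounding choices here are assumptions.
import numpy as np

def quantize_int4(weights: np.ndarray):
    """Map float weights to signed 4-bit integers in [-8, 7] plus a scale."""
    scale = np.max(np.abs(weights)) / 7.0  # 7 = largest positive int4 value
    q = np.clip(np.round(weights / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float weights for error analysis."""
    return q.astype(np.float32) * scale

if __name__ == "__main__":
    w = np.random.randn(4, 4).astype(np.float32)
    q, s = quantize_int4(w)
    err = np.abs(w - dequantize(q, s)).mean()
    print(f"mean absolute quantization error: {err:.4f}")
```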
With the complexity of applications increasing every day, the workloads generated by these applications are complicated and hard to replicate in test environments. We propose an efficient method to synthesize a close approximation of these application workloads by iteratively analyzing historical autosupport data from the field, along with a method to store and replay these workloads in the test environment to achieve the goals of customer-driven testing.

Problem Statement: As we align more toward customer-driven testing, the quantity and complexity of workloads grows exponentially. Most of our regression testing uses workloads designed for testing functionality and stressing the array, but some corner cases or race conditions are only caught using complex customer workloads. It is very difficult and time consuming to model and synthesize customer workloads whenever there is a need to reproduce escalations or POCs (proofs of concept). Existing IO tools have no direct mechanism to simulate these customer workloads. Some tools do provide the capability to capture and replay workloads, but only at the host level. They also lack the ability to analyze array stats, which is more significant for modeling customer workloads.

Solution: In this paper we describe two solutions: the Workload Analyzer and Synthesizer (WAS), which analyzes array autosupport (ASUP) data to synthesize customer workloads, and the Workload Matrix Solution (WMS), which integrates and deploys these synthesized workloads in existing test environments.
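The abstract describes an iterative mechanism that refines a synthetic workload until it approximates the stats observed in ASUP data. The sketch below shows one possible shape of such a loop; the profile fields, the stat model, and the adjustment rule are all hypothetical placeholders, not the WAS implementation.

```python
# Hedged sketch of an iterative synthesis loop in the spirit of WAS: adjust a
# candidate workload profile until its measured stats approach the target
# stats derived from ASUP data. All names and rules here are assumptions.
from dataclasses import dataclass

@dataclass
class Profile:
    read_pct: float      # fraction of read ops
    block_size_kb: int   # dominant IO size
    iops: int            # target IO rate

def simulate_stats(p: Profile) -> dict:
    """Stand-in for replaying the profile on a test array and collecting stats."""
    return {"read_pct": p.read_pct,
            "throughput_mbps": p.iops * p.block_size_kb / 1024}

def synthesize(target: dict, tol: float = 0.05, max_iters: int = 50) -> Profile:
    p = Profile(read_pct=0.5, block_size_kb=8, iops=10_000)  # initial guess
    for _ in range(max_iters):
        stats = simulate_stats(p)
        err = abs(stats["throughput_mbps"] - target["throughput_mbps"]) / target["throughput_mbps"]
        if err < tol and abs(stats["read_pct"] - target["read_pct"]) < tol:
            break
        # Simple proportional adjustment toward the target (illustrative only).
        p.iops = int(p.iops * target["throughput_mbps"] / max(stats["throughput_mbps"], 1e-6))
        p.read_pct = target["read_pct"]
    return p

if __name__ == "__main__":
    target_stats = {"read_pct": 0.7, "throughput_mbps": 400.0}
    print(synthesize(target_stats))
```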
Object storage systems provide significant value for storing and managing data. The nature of data stored in object systems opens up opportunities to get more value out of these systems than the common expectations of cost reduction, ease of use, resilience, and durability. Maintaining the metadata for large unstructured data sets is difficult and can be time consuming. The system I propose here is an add-on engine that adds artificial intelligence inferencing functionality to object storage systems. The engine is notified upon file creation and then invokes common detectors, identifiers, and techniques such as object detection and classification to identify and enhance information about the stored objects so that it can be indexed and searched. I provide a demonstration of the engine's capabilities and an integration example.
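The core pattern here is event-driven enrichment: a notification on object creation triggers inference, and the inferred labels become searchable metadata. The sketch below shows that flow in miniature; the event shape, the classify() stub, and the in-memory index are hypothetical stand-ins, not the demonstrated engine or any specific object store API.

```python
# Hedged sketch of the event-driven enrichment pattern the abstract describes:
# on an object-created notification, run an inference step and index the
# results as searchable metadata. All names below are illustrative placeholders.
from typing import Dict, List

METADATA_INDEX: Dict[str, List[str]] = {}  # object key -> inferred labels

def classify(object_bytes: bytes) -> List[str]:
    """Placeholder for an object-detection/classification model call."""
    return ["person", "vehicle"] if object_bytes else []

def on_object_created(event: Dict) -> None:
    """Handle a bucket notification and index the inferred labels."""
    key = event["key"]
    data = event.get("data", b"")
    METADATA_INDEX[key] = classify(data)

def search(label: str) -> List[str]:
    """Return object keys whose inferred labels include the query label."""
    return [k for k, labels in METADATA_INDEX.items() if label in labels]

if __name__ == "__main__":
    on_object_created({"key": "photos/cam01/frame-0001.jpg", "data": b"\xff\xd8"})
    print(search("vehicle"))  # -> ['photos/cam01/frame-0001.jpg']
```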
Artificial intelligence (AI) systems are creating numerous opportunities and challenges for many facets of society, including both security and privacy. For security, AI is proving to be a powerful tool for both adversaries and defenders. In addition, AI systems and their associated data have to be defended against a wide range of attacks, some of which are unique to AI. The situation with privacy is similar, but the societal concerns are elevated to a point where laws and regulations are already being enacted. This session explores the AI landscape through the lenses of security and privacy.
This talk discusses some of the trends we have observed over time in the Infrastructure space as it relates to AI applications and workloads.
As most of you are aware, AI has been evolving at a very rapid pace, with adoption growing from research and niche use cases to encompass the majority of mainstream consumer, enterprise, telco, cloud, and government use cases. This massive broadening of AI use cases has impacted the direction of infrastructure across the spectrum of silicon, software, systems, and solutions. Multiple factors affect the long-term view of how we should be thinking about infrastructure. The key factors are: 1) applications (image classification, object segmentation, NLP, recommendation, speech), 2) neural network model sizes, 3) power, and 4) cooling. This has led to a broad range of accelerators addressing this space, from GPUs and FPGAs to custom ASICs. Since the use cases within AI are broad, the implementation of infrastructure using these compute devices can vary based on the deployment model, i.e., far edge, near edge, on-prem datacenter, and cloud. Besides accelerators, we are also seeing how important it is to address the needs for storage and networking, since at the end of the day an accelerator is just a fast calculator.
To address some of these factors, we will cover the evolving infrastructure landscape and what a future deployment of these AI solutions might look like.