SNIA Developer Conference September 15-17, 2025 | Santa Clara, CA

Enhancing Defect Triaging in Storage Systems Using Generative AI from Integration Test-Based Knowledge Graphs

Abstract

Bug detection and triaging in complex storage systems pose challenges that set them apart from general-purpose or SaaS-based software. Unlike conventional code, which largely runs in user space, storage solutions must integrate with the operating system kernel, device drivers, and underlying hardware devices. This tight coupling adds complexity in logging, concurrency, and operational flow. For instance, storage systems often span hundreds of threads and processes, each writing into shared log files without conventional transactional guarantees. Such intricate interactions make it difficult for existing AI-based bug-tracking solutions, which are typically trained on general codebases, to deliver effective results.

To address these limitations, we propose a novel approach that supplements the system code with knowledge extracted from high-level integration test cases. These tests, often written in human-readable scripting languages such as Python, capture end-to-end system behavior more effectively than narrowly focused unit tests. By converting the insights from integration tests into a structured knowledge graph, our methodology gives an AI bug-triaging agent rich contextual understanding of system interactions, inter-process communications, and hardware events. This deeper, scenario-driven perspective helps the agent pinpoint and diagnose the causes of storage system failures that would otherwise be hidden in the labyrinth of kernel-mode calls, user-mode processes, and low-level device drivers. Our early findings suggest that this targeted fusion of code analysis and integration-test-based knowledge significantly improves both the speed and accuracy of bug identification in storage software, an advancement poised to transform how complex system bugs are tracked and resolved.

Learning Objectives

Understand the Unique Challenges of Bug Detection in Storage Systems: Learn the complexities of bug detection and triaging in storage systems, including the integration with the operating system kernel, device drivers, and hardware, and how these differ from general-purpose software.

Explore the Limitations of Traditional AI-Based Bug-Tracking Solutions: Gain insight into why existing AI-based bug-tracking systems, typically trained on general codebases, are ineffective in addressing the unique challenges posed by storage solutions.

Learn How Integration Test Cases Enhance Bug Detection: Discover how high-level integration tests, written in human-readable scripting languages like Python, provide a more effective means of capturing end-to-end system behavior compared to unit tests.

Understand the Role of Knowledge Graphs in Bug Triaging: Learn how converting insights from integration tests into structured knowledge graphs helps AI bug-triaging agents develop a deeper contextual understanding of system interactions, inter-process communications, and hardware events.

Evaluate the Impact of AI-Driven Bug Detection in Complex Systems: Understand how the fusion of code analysis and integration-test-based knowledge significantly improves the speed and accuracy of bug identification, and explore the potential for transforming bug tracking and resolution in complex storage software systems.