The SNIA Cloud Storage Technologies Community recently hosted a live webinar, “Intelligent Data Management: Shaping the Future for AI Workloads,” featuring a deep dive into the data management challenges posed by AI and how the SNIA Cloud Data Management Interface (CDMI™) addresses them through open standards. The session received a high rating from attendees and sparked several questions about CDMI’s role in intelligent data management. Our presenters have answered those questions in this Q&A blog. If you missed the webinar, you can watch the on-demand recording at the SNIA Educational Library.

Q: Is CDMI just for Cloud?

A: No, CDMI is for any system that uses a cloud-style interface, specifically, for any system where resources are identified by URLs. This includes on-prem storage systems, enterprise systems that manage data, and middleware systems that transport and process data. Many applications and systems that use CDMI are in-house enterprise or government systems that require standardized data discovery, data management, rich metadata and data portability, such as electronic health records, banking records, cultural preservation digital assets and scientific computing data managers.

Q: What business or technical problem does CDMI solve?

A: CDMI provides a unified, protocol-independent way to manage data across clouds and on-prem systems, addressing fragmentation, inconsistent metadata, and limited portability found in protocol-specific or vendor-specific mechanisms.

Q: How does CDMI differ from other data management interfaces? 

A: Some data management interfaces are specific to data storage applications such as profitability management. Other interfaces are specific to technologies such as configuring access to analytical tools. Others are specific to roles such as master data management (MDM). Yet other data management interfaces focus on creating new data objects by combining other data objects (such as Tibco’s UMD).

Data management interfaces have traditionally been tightly coupled to specific storage protocols and systems. For example, you use database management protocols for tables, bucket management interfaces for S3, and storage APIs for managing file systems.

With the introduction of cloud services, it rapidly became clear that this approach would leave significant gaps. Cloud services created growing requirements for data mobility between providers (as well as with on-prem systems), and increasing service diversity created growing requirements for data portability between services. In addition, data management functions within a given protocol can only express the concepts defined within that protocol. It was therefore clear to the group of storage vendors that founded the SNIA Cloud Storage Technical Work Group that an "overlay" management protocol was required, one that operates outside of each of the different service protocols and service providers.

As a result, CDMI allows any data resource accessible in a cloud (and on-prem) to be managed in a consistent way: You start with a URL to the data, and use CDMI to manage that data in a consistent and standard way, regardless of the format of the data, the location of the data, the protocol(s) being used to access the data and the service providing access. This is a very powerful concept because it makes CDMI protocol-independent and service-independent. 
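As a rough illustration of the "start with a URL" idea, the Python sketch below builds (but does not send) an HTTP request asking for the metadata of a resource. It is a sketch under assumptions, not a reference client: the host name and path are made up, the helper name cdmi_metadata_request is hypothetical, and the application/cdmi-object media type and ?metadata field selector reflect our reading of published CDMI versions and should be checked against the version your system implements.

```python
from urllib.parse import urlsplit, urlunsplit

# Media type for CDMI data objects in published CDMI versions
# (assumption: verify against the version you target).
CDMI_OBJECT = "application/cdmi-object"


def cdmi_metadata_request(data_url: str) -> dict:
    """Describe a GET request for the metadata of the resource at data_url.

    Hypothetical helper: it only builds a description of the request and
    never touches the network.
    """
    parts = urlsplit(data_url)
    if parts.scheme not in ("http", "https"):
        raise ValueError("expected an http(s) URL, got: " + data_url)
    # Keep scheme, host, and path; ask only for the "metadata" field.
    target = urlunsplit((parts.scheme, parts.netloc, parts.path, "metadata", ""))
    return {"method": "GET", "url": target, "headers": {"Accept": CDMI_OBJECT}}


# Example URL is invented for illustration.
request = cdmi_metadata_request("https://storage.example.com/archive/scan-0001.tiff")
print(request["method"], request["url"])
```

The same function works whatever sits behind the URL, because it depends only on the URL and not on the format of the data or the protocol normally used to read it.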

This independence also lets CDMI realize some of its core value: In discovery, because CDMI exists outside any given data access protocol, you can identify available protocols and specify access protocols, something that can't be done within any specific data access protocol. Likewise, CDMI allows access to a superset of metadata, which aids in data portability. This would be extremely difficult to do within a data access protocol, since it would limit what is expressible.

This is the biggest distinguishing characteristic of CDMI, and what makes it different from other data management capabilities and standards. It sits at a level above these protocols and associated concepts, which allows it to provide cross-protocol abstractions and portability.

Q: Why is CDMI better than other proprietary solutions?

A: CDMI is an open, vendor-neutral interface standard (ISO/IEC 17826), not a proprietary product or ad-hoc specification. As an open standard, anyone can implement, extend, and build upon it without licensing fees, vendor approval, or risk of unilateral deprecation. This openness drives a competitive ecosystem of interoperable implementations, giving organizations real choice in tooling and suppliers rather than a forced dependency on a single vendor's roadmap.

Most management frameworks are domain-specific and tightly bound to a single cloud provider. This creates operational gaps when organizations need to manage resources outside those services, deploy on-prem subsets of cloud functionality, or coordinate across multiple providers. The result is duplicated tooling, inconsistent workflows, and increased operational overhead—costs that compound as infrastructure scales. Vendor lock-in also carries strategic risk: pricing changes, service discontinuation, or shifts in a vendor's product direction can force costly re-engineering with little notice.

CDMI eliminates these gaps by providing a consistent management interface regardless of where workloads run. Its vendor-neutral design means organizations are not penalized for infrastructure decisions—adding a new cloud provider, repatriating workloads on-prem, or executing a multi-cloud strategy all operate under the same management model. This consistency delivers measurable operational gains: reduced integration complexity, lower staff retraining costs when environments change, simplified compliance and audit workflows across heterogeneous infrastructure, and faster execution of data mobility and portability tasks.

Because CDMI is governed as an ISO standard rather than a product, its evolution is driven by broad industry consensus rather than a single vendor's commercial interests. Organizations that adopt it invest in a stable, long-lived interface with a clear governance path—one that remains portable across the industry rather than tied to any one platform. This makes CDMI a sound foundation for data center strategy, HPC storage management, and hybrid cloud operations where longevity, interoperability, and operational continuity are priorities.

Q: How does CDMI help with AI applications and use cases?

A: There are quite a few areas where CDMI is very well aligned to solve emerging AI data management challenges. For example, in the EU, GDPR Article 20, the "right to data portability," plus the EU Data Act, require AI systems to provide access to all stored data associated with a user's AI services. CDMI provides a standard and low-effort approach to allow providers to remove technical barriers to data migration and enable interoperability by allowing for rich self-describing metadata to be directly attached to stored data items, as well as providing a standardized data transfer and serialization format.
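To make the portability point concrete, here is a minimal Python sketch of keeping a stored item and its descriptive metadata together in one self-describing JSON document, then recovering both on another system. The field names (objectName, mimetype, metadata, valuetransferencoding, value) follow our reading of the CDMI data object representation and should be verified against the specification; the functions export_record and import_record, the object name, and the metadata keys are hypothetical and chosen only for illustration.

```python
import base64
import json


def export_record(name: str, mimetype: str, payload: bytes, metadata: dict) -> str:
    """Serialize payload and metadata into one JSON document (illustrative)."""
    record = {
        "objectName": name,
        "mimetype": mimetype,
        "metadata": metadata,
        # The payload is carried as base64 text so arbitrary bytes survive JSON.
        "valuetransferencoding": "base64",
        "value": base64.b64encode(payload).decode("ascii"),
    }
    return json.dumps(record, sort_keys=True)


def import_record(document: str) -> tuple:
    """Recover (payload bytes, metadata dict) from an exported document."""
    record = json.loads(document)
    if record["valuetransferencoding"] != "base64":
        raise ValueError("this sketch only handles base64-encoded values")
    return base64.b64decode(record["value"]), record["metadata"]


# Invented example data: a user's stored item plus descriptive metadata.
exported = export_record(
    "chat-history.json",
    "application/json",
    b'{"messages": []}',
    {"owner": "user-1234", "exported_for": "portability request"},
)
payload, metadata = import_record(exported)
```

Because the metadata is carried in the same document as the data, the receiving system does not need a side channel to learn what the item is or who it belongs to.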

Another area where CDMI's rich metadata capabilities help is the new graph relationship metadata being added in CDMI 3.0, which provides the ability to transport extracted metadata and knowledge graphs together with the documents and document fragments stored in AI data repositories.

And finally, CDMI helps AI frameworks discover the best way to access data. An AI application can ask a CDMI-enabled system to indicate the best way and source to access data, and CDMI indicates which protocols are available, which accelerations are enabled, and which locations are optimal for data access.

Q: Who should consider implementing CDMI?

A: Organizations needing interoperability, multi-cloud portability, AI-ready metadata, or compliance-driven data mobility (e.g., GDPR Article 20) benefit most.

Q: What is new in CDMI 3.0?

A: CDMI 3.0 introduces expanded support for multi-protocol data access, improved namespace discovery, enhanced metadata (including graph relationships), and better support for AI-driven use cases.

Q: How does CDMI support multi-protocol environments?

A: CDMI operates as a management protocol, independently of data access protocols, and can identify which protocols and accelerations are available for optimal data access. This allows CDMI to add significant value to existing data access protocols such as NFS, CIFS, S3, etc.
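The selection step a client might perform after discovery can be sketched as follows. This is not the CDMI wire format: the AccessOption structure, its fields, the location labels, and the choose_access function are all hypothetical, invented here only to show how a client could combine "which protocols are available" with "which accelerations and locations are best" once a management layer has reported them.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Set


@dataclass(frozen=True)
class AccessOption:
    """One way to reach the data, as reported by a discovery step (hypothetical)."""
    protocol: str      # e.g. "NFS", "CIFS", "S3"
    location: str      # site or region label, invented for this sketch
    accelerated: bool  # whether an acceleration is enabled on this path


def choose_access(
    options: Iterable[AccessOption],
    supported: Set[str],
    preferred_location: str,
) -> Optional[AccessOption]:
    """Pick the best option the client can actually use, or None if there is none."""
    usable = [o for o in options if o.protocol in supported]
    if not usable:
        return None
    # Prefer the client's own location first, then accelerated paths.
    return max(usable, key=lambda o: (o.location == preferred_location, o.accelerated))


advertised = [
    AccessOption("NFS", "eu-west", False),
    AccessOption("S3", "us-east", True),
    AccessOption("S3", "eu-west", True),
]
best = choose_access(advertised, {"S3"}, "eu-west")
print(best)
```

The point of the sketch is the division of labor: the data access protocols stay unchanged, and the choice among them is made using information that only a layer outside those protocols can provide.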

Q: How does CDMI help with AI applications and workloads? 

A: CDMI helps address AI data management challenges by providing a standardized way to attach rich, self-describing metadata to stored data, which supports regulatory requirements such as GDPR Article 20 and enables seamless data portability across systems. It also allows AI workflows to transport metadata and knowledge-graph relationships alongside documents and fragments, a capability expanded further in CDMI 3.0. Finally, CDMI enables AI frameworks to discover the most efficient access paths by identifying optimal protocols, accelerations, and locations for retrieving data.

Q: What kind of metadata does CDMI support, and why does this matter?

A: CDMI supports structured metadata organized hierarchically (and, with CDMI 3.0, as graphs). All metadata items are expressed as strings, which allows support beyond the limitations of JSON, so any text, numeric, and binary data can be represented. CDMI 3.0 graph support also allows this metadata to become self-describing, which facilitates use by agentic systems and other AI applications. Applications can therefore use CDMI as a superset format to store and transport metadata as well as data without loss, and when data is moved between systems, this further reduces data loss and improves metadata conversion accuracy.
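A small Python sketch shows why string-valued metadata helps. It assumes a simple convention that is ours, not the specification's: numbers are written in decimal and binary values as base64 text. The helper to_metadata_string and the metadata keys are hypothetical. The large integer is chosen because it cannot be represented exactly as a double-precision float, which is how many JSON parsers read numbers.

```python
import base64
import json


def to_metadata_string(value) -> str:
    """Encode a metadata value as a string (assumed convention for this sketch)."""
    if isinstance(value, bytes):
        # Binary values become base64 text so they survive JSON transport.
        return base64.b64encode(value).decode("ascii")
    return str(value)


big_count = 9007199254740993  # 2**53 + 1, not exactly representable as a double
digest = b"\x00\xff\x10"

metadata = {
    "sample_count": to_metadata_string(big_count),
    "digest": to_metadata_string(digest),
    "title": to_metadata_string("Scan 0001"),
}

# Round-trip through JSON text, as would happen when moving between systems.
restored = json.loads(json.dumps({"metadata": metadata}))["metadata"]

count_back = int(restored["sample_count"])
digest_back = base64.b64decode(restored["digest"])

# For contrast: the same integer pushed through a double loses its last digit.
lossy = int(float(big_count))
```

Here count_back and digest_back match the original values, while lossy does not, which is the kind of silent conversion error that string-valued metadata is meant to avoid.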

Q: Does CDMI replace existing cloud provider interfaces?

A: No. CDMI functions as an overlay that coexists with vendor APIs, enabling management across services rather than replacing per-protocol APIs.

Q: What are the most common CDMI use cases?

A: The most common use cases are:

  • Data discovery
  • Cross-protocol data access management
  • Multi-cloud/cross-system data portability
  • Cross-cloud or on-prem/cloud migration
  • Compliance-driven data extraction and transfer workflows

We anticipate that AI data catalogs and knowledge graph portability will become increasingly common CDMI use cases.

Q: How does CDMI support data lifecycle management across clouds and on-prem?

A: CDMI enables consistent management operations—discovery, metadata association, and optimized access—across environments, supporting movement and lifecycle governance of data resources.

Q: Where can I learn more about CDMI? 

A: CDMI 3.0 is under development with a target completion at the end of 2026. Plans for this next revision of CDMI are detailed in the Introduction to CDMI 3.0 White Paper. Now is a great time to get involved in working with us on CDMI 3.0. If you are interested, contact the SNIA Cloud Storage Technical Work Group Chair, David Slik, at cloudtwg-chair@snia.org.