Self-contained Information Retention Format (SIRF)

SIRF is a logical container format for the storage subsystem, designed for the long-term storage of digital information. It applies to a logical or physical storage area treated as a unit (a storage container). For example, a storage container may be a mountable data storage unit, a file system, a tape, a block device, a stream device, an object store, or a data bucket in cloud storage. SIRF is self-describing: it can be interpreted by different systems and at different points in time. SIRF is also self-contained: all data needed to interpret the preservation objects is held in the container. The metaphor we use is a closed bottle that includes all the information needed to understand the bottle's contents at a future point in time.

SIRF leverages the knowledge of the archival profession and helps archivists remain comfortable with the digital domain. Generally, archivists group physical items into collections and store them in a physical container (e.g. a file folder or an archival box of standard dimensions), and label that container with a "finding aid" that gives the name and location of the collection, its size, and an overview of its contents. SIRF is the digital equivalent of the physical container - the archival box or file folder - that defines a collection of preservation objects, and that can be labeled with standard information in a defined format to allow retrieval when needed.

The following figure illustrates the components included in the SIRF container: 

  • A magic object that identifies whether this is a SIRF container and gives its version.
  • Preservation objects that are immutable. The container may include multiple versions of a preservation object. 
  • A catalog that is updateable and contains metadata needed to make the container and its preservation objects portable into the future without relying on functions external to the storage subsystem. It contains metadata relating to the entire contents of the container as well as to the individual preservation objects and their relationships.

SIRF reduces the cost of preservation because preservation processes can run at a lower level of the system stack, close to the data, using more robust, efficient, and automated methods. Simpler, more efficient preservation processes in turn make the preservation of digital content more scalable and less costly.
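
The three container components listed above (magic object, immutable preservation objects, updateable catalog) can be pictured with a minimal in-memory sketch. This is an illustration only, not the SIRF serialization: the class names, field names, the "1.0" version string, and the "object_id@version" catalog key are all hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass(frozen=True)
class MagicObject:
    """Identifies the container as SIRF and records the format version."""
    format_name: str  # hypothetical field name
    version: str      # hypothetical field name


@dataclass(frozen=True)
class PreservationObject:
    """Immutable payload: a change produces a new version, never an edit."""
    object_id: str
    version: int
    data: bytes


def _empty_catalog() -> Dict[str, dict]:
    # Container-level metadata plus per-object metadata (hypothetical layout).
    return {"container": {}, "objects": {}}


@dataclass
class Container:
    magic: MagicObject
    # Preservation objects keyed by (object_id, version).
    objects: Dict[Tuple[str, int], PreservationObject] = field(default_factory=dict)
    # The catalog is the only updateable component.
    catalog: Dict[str, dict] = field(default_factory=_empty_catalog)

    def add_version(self, object_id: str, data: bytes, metadata: dict) -> PreservationObject:
        """Store data as the next version of object_id and catalog its metadata."""
        existing = [v for (oid, v) in self.objects if oid == object_id]
        po = PreservationObject(object_id, max(existing, default=0) + 1, data)
        self.objects[(po.object_id, po.version)] = po
        self.catalog["objects"][f"{po.object_id}@{po.version}"] = dict(metadata)
        return po


container = Container(MagicObject("SIRF", "1.0"))  # "1.0" is an assumed value
first = container.add_version("report", b"draft", {"format": "text/plain"})
second = container.add_version("report", b"final", {"format": "text/plain"})
```

Here both versions of "report" stay in the container side by side, and only the catalog is modified when a version is added, which mirrors the split between immutable preservation objects and an updateable catalog.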

The public review.


  • Simona Rabinovici-Cohen, Roger Cummings and Sam Fineberg, "Self-contained Information Retention Format For Future Semantic Interoperability", Proceedings of the 4th International Workshop on Semantic Digital Archives (SDA), September 2014, London, UK
  • Simona Rabinovici-Cohen, Mary G. Baker, Roger Cummings, Sam Fineberg, and John Marberg, "Towards SIRF: Self-contained Information Retention Format", Proceedings of the Annual International Systems and Storage Conference (SYSTOR), May 30-June 1, 2011, Haifa, Israel

