Updated August 2024
Storage security is an essential practice to ensure the confidentiality of the information stored in Information and Communication Technology (ICT) systems. Among these practices, storage sanitization is a crucial method to mitigate the risk of unauthorized data disclosure (data breach). Terms like 'data wiping', 'data deletion', 'data erasure', or 'secure data deletion' are often vague and misleading. For instance, 'deletion' may merely entail removing a few file system pointers without eliminating the actual data, and 'secure data deletion' might only address currently accessible data copies, leaving other copies intact. Even the term 'data shredding' is often misinterpreted, sometimes referring to a physical shredder and sometimes to the scrambling of returned data by deleting the encryption key. According to the IEEE Standard for Sanitizing Storage (IEEE 2883-2022), 'storage sanitization' is the only term explicitly defined and widely accepted.
Sanitization: A process or method to render access to target data on storage media infeasible for a given level of effort.
Removing data is not trivial. All copies of the data must be located, all the data must be classified for sensitivity and risk of the data being accessed by an unauthorized party, the data storage technologies that the data is located on (which are designed to guard against data loss) must be identified, and this data removal must be compliant with company or government policies.
Data sanitization targets all instances of stored data, across all copies, wherever the data resides. Storage sanitization focuses on ICT infrastructure that uses non-volatile storage. Often, this data is abstracted away in a complex storage system that uses erasure coding, RAID, or other parity for data protection, as well as other modern storage array features like compression, deduplication, and encryption. Logical sanitization targets this data on logical or virtual storage. Media sanitization targets the data on a specific storage device.
Figure 1 shows the three data sanitization methods discussed in the IEEE 2883-2022 specification.
Clear
Uses logical techniques to remove data on all addressable storage
Protects against simple non-invasive data recovery
Format, deallocate
Purge
Uses logical or physical techniques to remove all data
Infeasible data recovery with state of the art techniques
Block erase, crypto erase, overwrite
Destruct
Infeasible data recovery with state of the art techniques
Leaves device in unusable state
Disintegrate, incinerate, melt
Figure 1. Three Basic Data Sanitization Methods
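The content of Figure 1 can also be captured as a small lookup table, which is handy when a policy tool has to reason about which methods leave a device reusable. The sketch below is only an illustration: the class name, field names, and helper function are hypothetical, and the values are copied from the figure rather than from the text of IEEE 2883-2022.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SanitizationMethod:
    """One column of Figure 1 (field names are illustrative, not from the standard)."""
    name: str
    approach: str
    protects_against: str
    device_reusable: bool
    example_techniques: tuple


# Values transcribed from Figure 1.
METHODS = {
    "clear": SanitizationMethod(
        name="Clear",
        approach="logical techniques on all addressable storage",
        protects_against="simple non-invasive data recovery",
        device_reusable=True,
        example_techniques=("format", "deallocate"),
    ),
    "purge": SanitizationMethod(
        name="Purge",
        approach="logical or physical techniques on all data",
        protects_against="state-of-the-art data recovery",
        device_reusable=True,
        example_techniques=("block erase", "crypto erase", "overwrite"),
    ),
    "destruct": SanitizationMethod(
        name="Destruct",
        approach="physical destruction of the media",
        protects_against="state-of-the-art data recovery",
        device_reusable=False,
        example_techniques=("disintegrate", "incinerate", "melt"),
    ),
}


def reusable_methods():
    """Return the names of methods that leave the device usable afterwards."""
    return [m.name for m in METHODS.values() if m.device_reusable]


if __name__ == "__main__":
    print(reusable_methods())
```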
Following are more details about these three data sanitization methods.
The important element of purge sanitization is that it makes data recovery infeasible with state-of-the-art equipment. Note that data recovery companies often have access to special tools and methods that can help customers recover data from storage devices that have not been purged. These data recovery methods include disassembling the storage device and connecting various components, like the Head Disk Assembly (HDA, containing the HDD's heads on the voice coil actuator and the disk stack), to a different, operational circuit board. Even more extreme (and expensive) data recovery techniques include electron microscopy, magnetic force microscopy on HDDs, and placing NAND die extracted from an SSD directly into a tester to read the raw pages.
Purge techniques prevent even the most sophisticated data recovery experts from obtaining any user data. NAND flash is known to provide high-performance, low-latency, low-power access to data. One drawback of NAND from an SSD perspective is that a cell has to be erased before it can be programmed. NAND is written in pages, today often around 16 kilobytes each, but it is erased in blocks, each of which can contain hundreds of pages.
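A little arithmetic shows how large the gap between the write unit and the erase unit is. The sketch below assumes a 16 KiB page (as in the text) and 256 pages per block; the second figure is a hypothetical value chosen for illustration, since real parts vary.

```python
PAGE_SIZE_BYTES = 16 * 1024   # page size mentioned in the text (about 16 KB)
PAGES_PER_BLOCK = 256         # ASSUMPTION: "hundreds of pages", illustrative only


def erase_block_bytes(page_size=PAGE_SIZE_BYTES, pages_per_block=PAGES_PER_BLOCK):
    """Size of the smallest unit that can be erased."""
    return page_size * pages_per_block


def pages_to_relocate(valid_pages_in_block):
    """Pages garbage collection must copy elsewhere before the block is erased."""
    if not 0 <= valid_pages_in_block <= PAGES_PER_BLOCK:
        raise ValueError("valid page count out of range")
    return valid_pages_in_block


if __name__ == "__main__":
    block = erase_block_bytes()
    print(f"erase block = {block // (1024 * 1024)} MiB "
          f"({block // PAGE_SIZE_BYTES}x the page size)")
```

Under these assumptions a single erase affects 4 MiB of NAND even if only one 16 KiB page in the block is stale, which is why the drive needs spare area to relocate the pages that are still valid.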
Modern SSDs deal with this by using spare memory areas, or overprovisioning, to perform garbage collection during storage operations. For media sanitization, the block-erase behavior of NAND flash is a useful feature, because multiple block erases can be executed in parallel quickly, leaving no recoverable data. This can be performed in seconds to a few minutes, much faster than overwriting all the blocks with a random data pattern. During this sanitization, it is common to deallocate all the logical blocks, commonly referred to as TRIM, to improve SSD endurance for the next use. Because of this property of NAND flash technology, SSDs do not need to be overwritten with data to perform a data purge.
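A back-of-the-envelope estimate makes the difference between a parallel block erase and a full overwrite concrete. Every number in the sketch below is an assumption picked for illustration (drive capacity, block size, per-block erase time, number of blocks erased in parallel, sustained write rate), and it ignores overprovisioned capacity and controller overhead, so treat the result as an order-of-magnitude comparison only.

```python
import math

# ASSUMPTIONS (hypothetical drive, not taken from any datasheet)
CAPACITY_BYTES = 4 * 2**40        # 4 TiB of NAND
BLOCK_BYTES = 4 * 2**20           # 4 MiB erase block
ERASE_SECONDS_PER_BLOCK = 0.005   # 5 ms per block erase
PARALLEL_ERASES = 64              # blocks erased at the same time across dies/planes
WRITE_BYTES_PER_SECOND = 2e9      # 2 GB/s sustained sequential write


def block_erase_seconds(capacity, block_bytes, erase_s, parallel):
    """Time to erase every block when `parallel` blocks are erased per round."""
    blocks = math.ceil(capacity / block_bytes)
    rounds = math.ceil(blocks / parallel)
    return rounds * erase_s


def overwrite_seconds(capacity, write_rate):
    """Time to write one full pass over the capacity."""
    return capacity / write_rate


if __name__ == "__main__":
    erase = block_erase_seconds(CAPACITY_BYTES, BLOCK_BYTES,
                                ERASE_SECONDS_PER_BLOCK, PARALLEL_ERASES)
    write = overwrite_seconds(CAPACITY_BYTES, WRITE_BYTES_PER_SECOND)
    print(f"block erase: ~{erase:.0f} s, single overwrite pass: ~{write / 60:.0f} min")
```

With these assumed values the block erase finishes in roughly a minute and a half, while a single overwrite pass takes more than half an hour.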
Figure 2 shows how NAND is addressed by die, planes, erase blocks, and pages. SSDs have controllers that can perform operations simultaneously across many different channels, making the purge block erase happen quickly in parallel across many blocks, across many dice.
Figure 2. NAND Flash organization (Source: JEDEC ONFI 5.1 specification)
Self-encrypting drives (SEDs) encrypt data at rest. Most modern SSDs use an AES 256 engine to encrypt all data going to the NAND, regardless of whether a user or host operating system has set a user password. Most vendors will classify a drive as an SED if it supports encryption of data at rest utilizing the Trusted Computing Group Opal specification to set user passwords and locking ranges for the encrypted data [11]. The fact that most drives use the AES 256 engine at all times makes purge sanitization using cryptographic erase (CE) practical, because the drive only has to sanitize the media encryption key. CE can be performed in seconds, and can also be easily verified by reading back data and ensuring that it is random, and not the data that was present before the sanitize command. The considerations for cryptographic erase as a supported purge sanitization method are outlined in IEEE 2883-2022 Table B.1 [8], including key generation, media encryption, and key wrapping.
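The read-back check described above can be approximated with a simple entropy estimate. The sketch below is a minimal illustration, not a method defined by IEEE 2883-2022: the function names and the 7.9 bits-per-byte threshold are assumptions, and obtaining the `before` and `after` samples from the device is left out. High entropy is a necessary sign of random-looking data, not proof that the key was destroyed.

```python
import math
import os
from collections import Counter


def shannon_entropy_bits_per_byte(data: bytes) -> float:
    """Empirical Shannon entropy of a byte string (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def looks_random(data: bytes, threshold: float = 7.9) -> bool:
    """ASSUMPTION: 7.9 bits/byte is an illustrative cut-off for large samples."""
    return shannon_entropy_bits_per_byte(data) >= threshold


def passes_readback_check(before: bytes, after: bytes) -> bool:
    """The post-sanitize sample must differ from the old data and look random."""
    return after != before and looks_random(after)


if __name__ == "__main__":
    old_data = b"customer-record;" * 4096   # stand-in for data read before sanitize
    new_data = os.urandom(len(old_data))    # stand-in for data read after crypto erase
    print(passes_readback_check(old_data, new_data))
```

Samples should be large (tens of kilobytes or more), because the entropy estimate of a short sample is biased low even for truly random bytes.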
Modern cryptography using AES 256 is trusted in cloud and enterprise systems for data encryption. In a cryptographic erase, sanitization can be performed in seconds by destroying the media encryption key, leaving all the data on the drive encrypted. The effort needed to decrypt the data depends on the entropy of the key and the strength of the encryption algorithm. NIST considers AES 256 [12] quantum-safe today and in decades to come [13], as Grover’s algorithm is likely to provide little practical advantage over brute-force attacks. “Steal now and decrypt later” is an argument raised against cryptographic erase, but based on the current consensus of AES 256 and quantum computing experts, it should not dissuade the use of purge sanitization using CE.
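The reasoning behind the Grover's algorithm remark can be checked with simple arithmetic. Grover's search offers at most a quadratic speedup, so a 256-bit key still requires on the order of 2^128 sequential quantum operations. The sketch below assumes a purely hypothetical attacker rate of 10^18 operations per second to show the scale; that rate is an illustrative assumption, not a claim about any real or projected machine.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def grover_effective_bits(key_bits: int) -> int:
    """Quadratic speedup: searching 2**n keys takes about 2**(n/2) Grover iterations."""
    return key_bits // 2


def years_to_search(effective_bits: int, ops_per_second: float) -> float:
    """Time to perform 2**effective_bits operations at the given rate."""
    return (2 ** effective_bits) / ops_per_second / SECONDS_PER_YEAR


if __name__ == "__main__":
    bits = grover_effective_bits(256)
    # ASSUMPTION: 1e18 operations per second, a deliberately generous hypothetical rate.
    years = years_to_search(bits, 1e18)
    print(f"effective strength: {bits} bits, about {years:.1e} years to search")
```

Even at that assumed rate the search takes on the order of 10^13 years, which is why a destroyed AES 256 media encryption key is treated as unrecoverable.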
IEEE 2883-2022 is the latest standard for media sanitization. It defines sanitization methods and techniques for the specific storage media type (HDD, SSD, optical, removable, etc.) and specifies interface-specific techniques (SATA, SAS, NVMe). This specification can align the industry on terminology and modern techniques for media sanitization, targeting all logical and physical locations for data – including user data, old data, metadata and overprovisioning. The three sanitization methods outlined in the specification are clear, purge, and destruct.
Other specifications in the media sanitization ecosystem will reference the work done in IEEE 2883-2022. ISO/IEC 27040 adds requirements for sanitization compliance, including identification of the form of storage, verifying the results, and documenting and producing evidence of storage sanitization. The update to ISO/IEC DIS 27040 (2nd edition) is expected to be published in early 2024.
Most policies require proof that “the data is gone”. There are different media sanitization verification methods outlined in IEEE 2883-2022 and ISO 27040, but the important aspects are detailed below. For purge media sanitization, different verification methods can be used depending on the degree of risk and on how well the storage hardware and firmware are understood and validated (often through compliance, conformance testing, or forensic analysis). Some basic tenets of verification that should be considered are:
Media sanitization is critical in advancing the circular economy within the ICT industry. Reusing storage devices can reduce electronic waste (e-waste) and mitigate embedded carbon emissions from manufacturing storage media and devices. Effective media sanitization ensures that data-bearing devices can be safely reused and repurposed rather than discarded and recycled. Circularity in storage includes extending use (life), reuse, refurbishment, remanufacturing, and recycling. Circularity conserves valuable materials and reduces energy consumption and associated greenhouse gas emissions from manufacturing, sometimes called embedded carbon. The lifecycle of ICT products, starting from the extraction of raw materials to manufacturing, usage, and disposal, is associated with significant carbon emissions. The use period of a storage product lifecycle can be extended by facilitating the reuse and transfer of ownership through secure media sanitization, ensuring no confidential data is passed on to the following users.
Purge media sanitization was designed to eliminate almost all risk of data recovery or access, even by sophisticated attackers with high budgets (e.g. a nation-state). Purge utilizes the advancements in cryptography through methods like crypto erase to make sanitization quick, effective, and economical (sanitization and verification can be run over the storage interface, not requiring expensive equipment). Storage security and the risk of data leaking are major reasons why many companies use destruct sanitization today, as they believe it is lower risk. Purge media sanitization with proper verification and adherence eliminates nearly all the risk and leaves the device reusable, making it indispensable in achieving a circular economy.
You can find the Sanitize command, log pages, and theory of operation in the NVM Express Base (NVMe Base) Specification.
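As a starting point for reading that specification, the sketch below decodes which sanitize operations a controller advertises. It reflects my reading of the NVMe Base Specification: the low three bits of the Sanitize Capabilities (SANICAP) field in the Identify Controller data indicate crypto erase, block erase, and overwrite support, and the Sanitize Action values are 1 (exit failure mode), 2 (block erase), 3 (overwrite), and 4 (crypto erase). The Python names are hypothetical, and the bit positions should be verified against the specification revision in use.

```python
from enum import IntEnum


class SanitizeAction(IntEnum):
    """Sanitize Action (SANACT) values, per my reading of the NVMe Base Specification."""
    EXIT_FAILURE_MODE = 0b001
    BLOCK_ERASE = 0b010
    OVERWRITE = 0b011
    CRYPTO_ERASE = 0b100


# SANICAP support bits (ASSUMPTION: check against your spec revision).
SANICAP_CRYPTO_ERASE = 1 << 0
SANICAP_BLOCK_ERASE = 1 << 1
SANICAP_OVERWRITE = 1 << 2


def supported_actions(sanicap: int) -> list:
    """Return the sanitize actions advertised by a raw SANICAP value."""
    mapping = (
        (SANICAP_CRYPTO_ERASE, SanitizeAction.CRYPTO_ERASE),
        (SANICAP_BLOCK_ERASE, SanitizeAction.BLOCK_ERASE),
        (SANICAP_OVERWRITE, SanitizeAction.OVERWRITE),
    )
    return [action for bit, action in mapping if sanicap & bit]


if __name__ == "__main__":
    # Hypothetical controller that reports crypto erase and block erase support.
    for action in supported_actions(0b011):
        print(action.name)
```

This only interprets a value already read from the controller; issuing the Sanitize command itself and polling the sanitize status log are outside the scope of the sketch.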