Solving the Challenges of Crash Recoverability in Persistent Memory Programming | SNIA

Abstract

Programs access byte-addressable NVDIMMs using load/store instructions. Because of CPU caching, a store to an NVDIMM does not become persistent until it reaches the memory system. Thus, if part of an update to a data structure is still in the volatile cache when power fails or the system crashes, the cached portion is lost, and the data structure in memory can be left corrupted or unrecoverable.
Crash recoverability at the data-structure level is fundamental. Only if data structures are guaranteed to be crash recoverable can higher-level programming support, such as transaction constructs, be built correctly.
Developers can use cache-flushing instructions to force data out of the cache and make it persistent. However, it is challenging and error-prone for developers to reason about and identify where the cache should be flushed.
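
As a rough illustration of the flushing discipline described above, here is a minimal C sketch. It is not taken from the talk: the `record` layout and the `persist` and `record_write` helpers are hypothetical names introduced only for this example, and it assumes an x86 processor with 64-byte cache lines. The only hardware-specific calls are the `_mm_clflush` and `_mm_sfence` intrinsics; on other targets the sketch compiles to plain stores. The idea is to flush the payload before setting a `valid` flag, so that a crash between the two steps leaves a record that recovery code can recognize as incomplete.

```c
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>          /* _mm_clflush, _mm_sfence */
#define HAVE_CLFLUSH 1
#endif

#define CACHE_LINE 64           /* assumption: 64-byte cache lines */

/* Hypothetical record that would live in persistent memory. */
struct record {
    char     payload[56];
    uint64_t valid;             /* 1 only after payload has been flushed */
};

/* Flush every cache line covering [addr, addr + len), then fence so that
 * later stores are ordered after the flushes. */
static void persist(const void *addr, size_t len)
{
    uintptr_t p   = (uintptr_t)addr & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)addr + len;

    for (; p < end; p += CACHE_LINE) {
#ifdef HAVE_CLFLUSH
        _mm_clflush((const void *)p);
#endif
    }
#ifdef HAVE_CLFLUSH
    _mm_sfence();
#endif
}

/* Update the record so that a crash at any point leaves it either marked
 * invalid or complete, never marked valid with a partial payload. */
static void record_write(struct record *r, const char *msg)
{
    r->valid = 0;                               /* step 1: invalidate */
    persist(&r->valid, sizeof r->valid);

    strncpy(r->payload, msg, sizeof r->payload - 1);
    r->payload[sizeof r->payload - 1] = '\0';   /* step 2: write payload */
    persist(r->payload, sizeof r->payload);

    r->valid = 1;                               /* step 3: publish */
    persist(&r->valid, sizeof r->valid);
}
```

Omitting any one of the three `persist` calls yields code that still passes ordinary tests, because the data is visible in the cache either way, yet it can leave a record marked valid with a stale or partial payload after a crash. That gap between "works when run" and "recoverable after a crash" is what makes missing flushes hard to find by inspection.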

We’ll first discuss the challenges and then present a novel approach and a tool for pinpointing where in the source code cache flushes are needed.