Abstract
NoSQL databases are gaining traction in various industry verticals, such as web applications, analytics, and banking, where they are replacing relational database deployments for reasons such as scalability and flexibility [1]. These databases are typically deployed on commodity direct-attached storage devices and bundle value-added features such as replication, compression, and cluster management into their software. While these databases excel at serving queries at scale, they lack some enterprise features, such as backup and recovery. In this talk, we intend to explore the data and storage management opportunities that exist for such databases from a shared-storage perspective. While shared storage can offer easier storage management through consolidation and independent scaling of storage, we plan to discuss the challenges that need to be addressed:
- Implications of the data layouts of NoSQL databases on data management features such as storage efficiency. NoSQL databases perform inline compression and encryption, making it harder to add storage-level value-added features such as deduplication.
- Implications of the eventual consistency semantics of NoSQL databases on taking application-consistent storage backups. Moreover, given their scale and deployment on commodity nodes, quiescing the database cluster during backup is not feasible.
- Implications of frequent cluster topology changes on taking cluster-consistent storage snapshots.