Dynamic Object Routing

Publish Date: Tuesday, September 20, 2016

Problem Description: In object store systems, objects are identified by a unique identifier. Objects on disk are immutable: any modification to an existing object results in a new object, with a different timestamp, being created on disk. A common method of distributing objects to storage nodes is consistent hashing. A storage node can have multiple disks to store the objects assigned to it. Depending on application usage patterns, some hash buckets grow faster than others, so some disks become more heavily used than others.

* Existing solution - One way to solve this problem is to move the fast-growing hash bucket to one of the less used disks, which requires moving data from disk to disk.

* Our solution - A routing table is used to determine an object's storage location, based on the object's hash value and its insertion timestamp. Each hash bucket is initially assigned to one of the available disks. When the utilization of a hash bucket's storage disk exceeds the overall average disk utilization, another, less used disk is assigned to that hash bucket with a new timestamp. All new objects in that hash bucket are stored on the new disk. Existing objects are still read from the old disk, which the routing table resolves from their insertion timestamp. This method avoids moving data.
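
The routing scheme described above could be sketched roughly as follows. This is a minimal illustration, not the actual implementation: the class name RoutingTable, its methods, the use of SHA-256 with a modulo to pick a hash bucket, the round-robin initial assignment, and the integer timestamps are all assumptions made for the example. Each hash bucket keeps a time-ordered list of (start timestamp, disk) entries, and a lookup picks the entry that was in effect when the object was inserted.

```python
import bisect
import hashlib


class RoutingTable:
    """Hypothetical routing table: hash bucket -> time-ordered disk assignments."""

    def __init__(self, disks, num_buckets=8):
        self.num_buckets = num_buckets
        # Assumption: buckets are spread round-robin over the disks at time 0.
        self.routes = {
            bucket: [(0, disks[bucket % len(disks)])]
            for bucket in range(num_buckets)
        }

    def bucket_for(self, object_id):
        # Assumption: a stable hash of the object identifier selects the bucket.
        digest = hashlib.sha256(object_id.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % self.num_buckets

    def current_disk(self, bucket):
        # The most recent entry receives all new objects for this bucket.
        return self.routes[bucket][-1][1]

    def reassign(self, bucket, new_disk, timestamp):
        # Record a new disk for the bucket; no existing data is moved.
        entries = self.routes[bucket]
        if timestamp <= entries[-1][0]:
            raise ValueError("reassignment timestamp must be newer than the last entry")
        entries.append((timestamp, new_disk))

    def locate(self, object_id, insertion_timestamp):
        # Find the assignment that was active at the object's insertion time.
        entries = self.routes[self.bucket_for(object_id)]
        starts = [start for start, _ in entries]
        index = bisect.bisect_right(starts, insertion_timestamp) - 1
        if index < 0:
            raise KeyError("no assignment covers this insertion timestamp")
        return entries[index][1]


def is_over_average(disk, utilization):
    # Trigger condition from the text: disk utilization above the overall average.
    average = sum(utilization.values()) / len(utilization)
    return utilization[disk] > average
```

In this sketch, an object inserted before a reassignment still resolves to the old disk, while an object inserted at or after the reassignment timestamp resolves to the new disk. How the less used disk is chosen, and how utilization is measured, is left to the caller.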