What happens to hash functions when the range changes? Can we avoid chaos?
The paper shows the existence of such a “consistent hash function family” with the below properties.
Consistency Properties
Here, the buckets are the total range of the hash function (i.e. the modulus). This can increase to accommodate load.
Here, a view consists of a subset of buckets. As the number of buckets increase, pre-existing knowledge about buckets naturally become subsets of the system.
Balance
– The hash function hashes items within a view with approximately uniform probability.
Monotonicity
– If a view expands and an item hashes to a bucket contained in the previous view, they are hashed to the same bucket.
– (Items are either hashed to a new bucket or stay in the old one.)
Spread
– Across substantial views, items hash to approximately the same bucket (i.e. minimally distinct).
Load
– Across substantial views, there’s a reasonable limit to items hashed to each bucket.