Consistent Hashing
- #algorithms
- #hashing
- #distributed-systems
- Consistent hashing is a hashing scheme that minimizes key redistribution when the number of nodes (hash slots) changes.
- Naive modulo hashing (`hash(key) mod N`) remaps nearly all keys whenever N changes, forcing expensive data movement.
- Consistent hashing maps both nodes and items onto the same unit circle; each item belongs to the first node encountered clockwise from its position (see the ring sketch below).
- When a node joins or leaves, only the keys in the affected arc move: on average K/N of the K keys for N nodes, rather than nearly all of them.
- Virtual nodes, i.e. multiple ring positions per physical node, can be used to achieve a more balanced distribution of items across nodes, reducing load imbalance (see the virtual-node sketch below).
- Implementations often keep node positions in a sorted array and use binary search for O(log n) lookups, as in the sketch below.
- The original consistent hashing paper (Karger et al., 1997) highlights the monotonicity property: when a node is added, keys move only to the new node, never between existing ones (checked in the snippet below).
- Practical implementations quantize the unit circle to a discrete range, e.g. 32- or 64-bit integers, for computational efficiency.
- Without virtual nodes the variance in per-node load can be high; virtual nodes reduce it significantly (see the comparison below).
- Consistent hashing is widely used in distributed systems, such as caching proxies and key-value stores like Dynamo.
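- A minimal sketch of the ring described above, in Python. The names (`HashRing`, `ring_hash`, `lookup`) and the MD5-derived 32-bit positions are illustrative assumptions, not from the original note; any uniformly distributed hash and integer width would do.

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    # Quantize the unit circle to the 32-bit integer range [0, 2**32):
    # the first 4 bytes of an MD5 digest serve as a position on the circle.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")

class HashRing:
    def __init__(self, nodes):
        # Sorted array of (position, node) pairs enables binary-search lookup.
        self._ring = sorted((ring_hash(n), n) for n in nodes)
        self._positions = [p for p, _ in self._ring]

    def lookup(self, key: str) -> str:
        # Walk clockwise: the first node at or after the key's position owns
        # it; the modulo wraps around past the largest position.
        i = bisect.bisect_right(self._positions, ring_hash(key)) % len(self._ring)
        return self._ring[i][1]

    def add(self, node: str) -> None:
        p = ring_hash(node)
        i = bisect.bisect_left(self._positions, p)
        self._positions.insert(i, p)
        self._ring.insert(i, (p, node))

    def remove(self, node: str) -> None:
        i = self._ring.index((ring_hash(node), node))
        del self._positions[i]
        del self._ring[i]
```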
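- A quick check of the monotonicity property against the sketch above; the node names and key set are made up for the demonstration.

```python
ring = HashRing([f"node-{i}" for i in range(10)])
keys = [f"key-{i}" for i in range(10_000)]

before = {k: ring.lookup(k) for k in keys}
ring.add("node-new")
after = {k: ring.lookup(k) for k in keys}

# Monotonicity: each key either stays put or moves to the newly added node,
# never between two pre-existing nodes.
assert all(after[k] in (before[k], "node-new") for k in keys)
```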
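- One way to layer virtual nodes onto the same ring: place each physical node at many positions under derived labels, and map a looked-up label back to its owner. The label scheme (`name#i`) and the default replica count of 100 are assumptions for this sketch.

```python
class VNodeRing(HashRing):
    def __init__(self, nodes, vnodes: int = 100):
        # Place each physical node at `vnodes` ring positions via labels such
        # as "node-3#17", remembering which physical node owns each label.
        self._owner = {}
        labels = []
        for node in nodes:
            for i in range(vnodes):
                label = f"{node}#{i}"
                self._owner[label] = node
                labels.append(label)
        super().__init__(labels)

    def lookup(self, key: str) -> str:
        # Resolve the virtual position found on the ring to its physical owner.
        return self._owner[super().lookup(key)]
```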
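- A small comparison of load imbalance with and without virtual nodes, continuing the sketch; the node and key counts are arbitrary, and exact ratios will vary with the hash.

```python
from collections import Counter

nodes = [f"node-{i}" for i in range(10)]
keys = [f"key-{i}" for i in range(100_000)]

for ring in (HashRing(nodes), VNodeRing(nodes, vnodes=200)):
    load = Counter(ring.lookup(k) for k in keys)
    # Ratio of the busiest node's load to a perfectly even share; closer to
    # 1.0 means better balance. Virtual nodes pull this ratio down sharply.
    ratio = max(load.values()) / (len(keys) / len(nodes))
    print(f"{type(ring).__name__}: max/ideal = {ratio:.2f}")
```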