Consistent Hashing
- #algorithms
- #hashing
- #distributed-systems
- Consistent hashing is a hashing scheme that minimizes key redistribution when the number of nodes (hash slots) changes.
- Naive modulo hashing (`hash(key) mod N`) remaps nearly all keys whenever N changes, forcing expensive data movement.
- Consistent hashing maps both nodes and items onto the same unit circle; each item belongs to the first node encountered clockwise from its position (see the ring sketch below).
- When a node joins or leaves, only the keys in the affected arc move: on average K/N of the K keys for N nodes, rather than nearly all of them.
- Virtual nodes, i.e. multiple ring positions per physical node, can be used to achieve a more balanced distribution of items across nodes, reducing load imbalance (see the virtual-node sketch below).
- Implementations often keep node positions in a sorted array and use binary search for O(log n) lookups, as in the sketch below.
- The original consistent hashing paper (Karger et al., 1997) highlights the monotonicity property: when a node is added, keys move only to the new node, never between existing ones (checked in the snippet below).
- Practical implementations quantize the unit circle to a discrete range, e.g. 32- or 64-bit integers, for computational efficiency.
- Without virtual nodes the variance in per-node load can be high; virtual nodes reduce it significantly (see the comparison below).
- Consistent hashing is widely used in distributed systems, such as caching proxies and key-value stores like Dynamo.
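- A minimal sketch of the ring described above, in Python. The names (`HashRing`, `ring_hash`, `lookup`) and the MD5-derived 32-bit positions are illustrative assumptions, not from the original note; any uniformly distributed hash and integer width would do.

```python
import bisect
import hashlib

def ring_hash(key: str) -> int:
    # Quantize the unit circle to the 32-bit integer range [0, 2**32):
    # the first 4 bytes of an MD5 digest serve as a position on the circle.
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:4], "big")

class HashRing:
    def __init__(self, nodes):
        # Sorted array of (position, node) pairs enables binary-search lookup.
        self._ring = sorted((ring_hash(n), n) for n in nodes)
        self._positions = [p for p, _ in self._ring]

    def lookup(self, key: str) -> str:
        # Walk clockwise: the first node at or after the key's position owns
        # it; the modulo wraps around past the largest position.
        i = bisect.bisect_right(self._positions, ring_hash(key)) % len(self._ring)
        return self._ring[i][1]

    def add(self, node: str) -> None:
        p = ring_hash(node)
        i = bisect.bisect_left(self._positions, p)
        self._positions.insert(i, p)
        self._ring.insert(i, (p, node))

    def remove(self, node: str) -> None:
        i = self._ring.index((ring_hash(node), node))
        del self._positions[i]
        del self._ring[i]
```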
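- A quick check of the monotonicity property against the sketch above; the node names and key set are made up for the demonstration.

```python
ring = HashRing([f"node-{i}" for i in range(10)])
keys = [f"key-{i}" for i in range(10_000)]

before = {k: ring.lookup(k) for k in keys}
ring.add("node-new")
after = {k: ring.lookup(k) for k in keys}

# Monotonicity: each key either stays put or moves to the newly added node,
# never between two pre-existing nodes.
assert all(after[k] in (before[k], "node-new") for k in keys)
```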
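- One way to layer virtual nodes onto the same ring: place each physical node at many positions under derived labels, and map a looked-up label back to its owner. The label scheme (`name#i`) and the default replica count of 100 are assumptions for this sketch.

```python
class VNodeRing(HashRing):
    def __init__(self, nodes, vnodes: int = 100):
        # Place each physical node at `vnodes` ring positions via labels such
        # as "node-3#17", remembering which physical node owns each label.
        self._owner = {}
        labels = []
        for node in nodes:
            for i in range(vnodes):
                label = f"{node}#{i}"
                self._owner[label] = node
                labels.append(label)
        super().__init__(labels)

    def lookup(self, key: str) -> str:
        # Resolve the virtual position found on the ring to its physical owner.
        return self._owner[super().lookup(key)]
```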
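- A small comparison of load imbalance with and without virtual nodes, continuing the sketch; the node and key counts are arbitrary, and exact ratios will vary with the hash.

```python
from collections import Counter

nodes = [f"node-{i}" for i in range(10)]
keys = [f"key-{i}" for i in range(100_000)]

for ring in (HashRing(nodes), VNodeRing(nodes, vnodes=200)):
    load = Counter(ring.lookup(k) for k in keys)
    # Ratio of the busiest node's load to a perfectly even share; closer to
    # 1.0 means better balance. Virtual nodes pull this ratio down sharply.
    ratio = max(load.values()) / (len(keys) / len(nodes))
    print(f"{type(ring).__name__}: max/ideal = {ratio:.2f}")
```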