Making Democracy Work: Fixing and Simplifying Egalitarian Paxos
15 days ago
- #paxos
- #fault-tolerance
- #distributed-systems
- Classical state-machine replication protocols like Paxos rely on a leader, creating a single point of failure and latency issues.
- Egalitarian Paxos introduced a leaderless approach, allowing replicas to order commands collaboratively, improving fault tolerance and performance.
- Egalitarian Paxos can maintain non-zero throughput with up to f crashes out of n = 2f+1 processes and allows fast command execution under certain conditions.
- Despite its benefits, Egalitarian Paxos is complex, ambiguously specified, and has bugs.
- EPaxos* is a simpler and correct variant of Egalitarian Paxos with a rigorously proven failure-recovery algorithm.
- EPaxos* generalizes Egalitarian Paxos to cover optimal failure thresholds f and e, with n ≥ max{2e+f-1, 2f+1}.