Raft (Replicated and Fault Tolerant) is a consensus algorithm designed to be understandable and implementable while providing the same safety and liveness properties as classical consensus protocols such as Paxos. It solves the problem of replicating a log of commands across a cluster of unreliable machines so that, together, they present a single, consistent state machine to clients.
Why Raft?
Raft breaks consensus into clear subproblems (leader election, log replication, safety, and configuration changes), which makes it easier to reason about and implement reliably. It is commonly used in distributed systems that require strong consistency, such as distributed key-value stores, coordination services, and databases; etcd and Consul are two well-known examples.
Core Concepts
- Term: A monotonically increasing logical clock. New elections increment the term. Higher term numbers indicate more recent information.
- Leader / Follower / Candidate: Every node starts as a follower. A follower that stops hearing from a leader becomes a candidate and starts an election. At most one leader is elected per term; it handles client requests and replicates entries.
- Log Entry: Each entry records a command, a term, and an index. The leader appends entries to its log and replicates them to followers.
- Commit: An entry is committed once the leader has replicated it to the logs of a majority of nodes and advanced its commit index past it. Committed entries are never lost and are applied, in log order, to each node's state machine.
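The concepts above map naturally onto a few small data types. The following is a minimal sketch, not a reference implementation; the names `Role`, `LogEntry`, `NodeState`, and `append` are assumptions chosen for illustration, and log indices are assumed to start at 1.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


@dataclass(frozen=True)
class LogEntry:
    index: int    # 1-based position in the log (assumed convention)
    term: int     # term in which a leader created the entry
    command: str  # opaque command for the state machine


@dataclass
class NodeState:
    current_term: int = 0            # logical clock, only ever increases
    voted_for: Optional[str] = None  # candidate voted for in current_term
    log: List[LogEntry] = field(default_factory=list)
    commit_index: int = 0            # highest index known to be committed
    last_applied: int = 0            # highest index applied to the state machine
    role: Role = Role.FOLLOWER       # nodes are followers by default

    def append(self, command: str) -> LogEntry:
        """Leader-side append: the entry gets the next index and the current term."""
        entry = LogEntry(index=len(self.log) + 1, term=self.current_term, command=command)
        self.log.append(entry)
        return entry
```

Appending an entry does not commit it: `commit_index` stays where it was until the leader learns that a majority holds the entry.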
Leader Election (Brief)
- Followers expect periodic heartbeats from the leader. If none arrive within a randomized election timeout, a follower becomes a candidate and starts an election by incrementing its term and requesting votes.
- Each node votes for at most one candidate per term, first come, first served, and only if the candidate’s log is at least as up-to-date as its own. Logs are compared by the term of the last entry first, then by log length.
- A candidate that receives votes from a majority becomes leader and immediately sends heartbeats (AppendEntries) to assert leadership.
- If a node observes a message with a higher term, it updates its term and reverts to follower.
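The voting rules above can be sketched as a single RequestVote handler. This is a simplified illustration under stated assumptions: the class `Voter` and its field names are hypothetical, and persistence and networking are left out.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Voter:
    current_term: int = 0
    voted_for: Optional[str] = None
    last_log_index: int = 0
    last_log_term: int = 0

    def handle_request_vote(self, term: int, candidate_id: str,
                            last_log_index: int, last_log_term: int) -> Tuple[int, bool]:
        """Return (current_term, vote_granted) for one RequestVote call."""
        # A candidate from an older term is always rejected.
        if term < self.current_term:
            return self.current_term, False
        # A higher term is adopted, which also clears any earlier vote.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # "At least as up-to-date": compare last term first, then last index.
        log_ok = (last_log_term, last_log_index) >= (self.last_log_term, self.last_log_index)
        # One vote per term; repeating the vote for the same candidate is allowed.
        can_vote = self.voted_for in (None, candidate_id)
        if log_ok and can_vote:
            self.voted_for = candidate_id
            return self.current_term, True
        return self.current_term, False
```

For example, a voter at term 2 that receives a request for term 3 from a candidate with an equally recent log grants the vote, and then refuses any other candidate for term 3. A real node would also step down to follower when it adopts the higher term.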
Log Replication (Brief)
- The leader receives client commands, appends them to its log, and sends AppendEntries RPCs to followers.
- Followers append entries only if the previous log index and term match; this ensures logs stay consistent.
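- When a leader sees an entry from its current term replicated on a majority, it advances its commit index. Followers learn the new commit index from subsequent AppendEntries RPCs and then apply the committed entries to their state machines.
- If a follower’s log and the leader’s log diverge, the leader steps backward until it finds the latest index and term on which both logs agree, then overwrites the follower’s conflicting entries from that point on.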
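The follower side of the replication rules above can be sketched as follows. This is a minimal illustration, assuming a log stored as a Python list of `(term, command)` tuples with 1-based Raft indices; the function name `append_entries` and its parameters are assumptions, and term checks on the RPC itself are omitted.

```python
from typing import List, Tuple

Entry = Tuple[int, str]  # (term, command); Raft index = list position + 1


def append_entries(log: List[Entry], prev_index: int, prev_term: int,
                   entries: List[Entry]) -> bool:
    """Apply one AppendEntries request to a follower log, in place.

    Returns False when the consistency check fails, in which case the
    leader is expected to retry with an earlier prev_index.
    """
    # Consistency check; prev_index == 0 means "before the first entry".
    if prev_index > len(log):
        return False
    if prev_index > 0 and log[prev_index - 1][0] != prev_term:
        return False

    for offset, entry in enumerate(entries):
        pos = prev_index + offset  # 0-based list position of this entry
        if pos < len(log):
            if log[pos][0] != entry[0]:
                # Conflict: drop this entry and everything after it.
                del log[pos:]
                log.append(entry)
            # Same term at the same index: already stored, nothing to do.
        else:
            log.append(entry)
    return True
```

Truncating only on an actual conflict, rather than always cutting the log at `prev_index`, keeps a delayed or duplicated request from removing entries the follower already accepted.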
Safety Guarantees
- Raft ensures that any committed entry is present in the logs of future leaders. This is enforced by voting rules that require a candidate’s log to be at least as up-to-date as voters’ logs.
- The combination of terms, majority voting, and careful commit rules prevents divergent committed histories.
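One of the commit rules referred to above is that a leader advances its commit index by counting replicas only for entries from its own term; older entries become committed indirectly once a later entry commits. A hedged sketch of that calculation, where `advance_commit_index`, `log_terms`, and `match_index` are illustrative names:

```python
from typing import Dict, List


def advance_commit_index(log_terms: List[int], match_index: Dict[str, int],
                         commit_index: int, current_term: int) -> int:
    """Return the leader's new commit index.

    log_terms[i] is the term of the entry at Raft index i + 1.
    match_index maps every node id, leader included, to the highest
    index known to be stored on that node.
    """
    majority = len(match_index) // 2 + 1
    # Scan from the newest entry down to the first uncommitted one.
    for n in range(len(log_terms), commit_index, -1):
        if log_terms[n - 1] != current_term:
            # Terms never decrease along a log, so every earlier entry
            # is also from an older term and cannot be counted directly.
            break
        replicas = sum(1 for m in match_index.values() if m >= n)
        if replicas >= majority:
            return n
    return commit_index
```

In a five-node cluster where the leader holds index 5 and two followers hold index 4, the function returns 4 when entry 4 belongs to the current term, since three of five nodes store it.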
Log Compaction and Snapshots
- Without compaction, the log grows without bound. Raft supports snapshotting: the applied state is written to a snapshot, and the log entries that the snapshot covers are discarded.
- Snapshots can be transferred to slow or new followers to bring them up to date without replaying the entire log.
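The bookkeeping behind compaction can be sketched as below. This is an illustration under assumptions, not a prescribed layout: the state machine is assumed to be a simple key-value map, and `CompactingLog`, `compact`, and the field names are hypothetical. The snapshot remembers the index and term of the last entry it covers so that the AppendEntries consistency check still works at the boundary.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class CompactingLog:
    # entries[i] holds the entry at Raft index snapshot_index + 1 + i,
    # stored as (term, key, value).
    entries: List[Tuple[int, str, str]] = field(default_factory=list)
    snapshot_index: int = 0  # last index covered by the snapshot
    snapshot_term: int = 0   # term of that entry
    snapshot_state: Dict[str, str] = field(default_factory=dict)

    def last_index(self) -> int:
        return self.snapshot_index + len(self.entries)

    def term_at(self, index: int) -> int:
        if index == self.snapshot_index:
            return self.snapshot_term
        return self.entries[index - self.snapshot_index - 1][0]

    def compact(self, applied_index: int) -> None:
        """Fold entries up to applied_index into the snapshot and drop them."""
        if applied_index <= self.snapshot_index:
            return  # already covered by the snapshot
        if applied_index > self.last_index():
            raise ValueError("cannot compact past the end of the log")
        count = applied_index - self.snapshot_index
        new_term = self.term_at(applied_index)  # read before mutating
        for _term, key, value in self.entries[:count]:
            self.snapshot_state[key] = value
        self.snapshot_index = applied_index
        self.snapshot_term = new_term
        del self.entries[:count]
```

Only entries that have already been applied should be compacted; the caller is assumed to pass an `applied_index` no greater than its last applied index.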
Membership Changes
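- Safe membership change is performed using a two-phase (joint) configuration. The cluster first operates in a joint configuration that includes both old and new members, during which elections and commits require separate majorities of both the old and the new configuration. Once the joint configuration is committed, the cluster switches to the new configuration. This prevents the old and new member sets from making conflicting decisions independently during the transition.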
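The quorum rule during the joint phase can be sketched in a few lines. The function names `has_majority` and `joint_quorum` are assumptions for illustration; node ids are plain strings.

```python
from typing import Iterable, Set


def has_majority(acks: Set[str], members: Set[str]) -> bool:
    """True if strictly more than half of `members` appear in `acks`."""
    return len(acks & members) > len(members) // 2


def joint_quorum(acks: Iterable[str], old: Set[str], new: Set[str]) -> bool:
    """During the joint phase, agreement needs a majority of the old
    configuration and, separately, a majority of the new one."""
    ack_set = set(acks)
    return has_majority(ack_set, old) and has_majority(ack_set, new)
```

With an old configuration of a, b, c and a new one of c, d, e, acknowledgements from a, b, and c are not enough because only one new member has agreed, whereas acknowledgements from a, c, and d satisfy both majorities.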
Practical Tips for Implementations
- Use randomized election timeouts to reduce split votes and election collisions.
- Keep the heartbeat interval well below the election timeout, so that a few lost or delayed heartbeats do not trigger an unnecessary election.
- Monitor metrics: election frequency, replication lag, commit latency, and snapshot/compaction activity.
- Handle persistence carefully: the current term, voted-for, and the log must be written to stable storage before a node responds to an RPC, so that they survive crashes.
- Make RPCs idempotent and design retry/backoff strategies for unstable networks.
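Two of the tips above, randomized election timeouts and retry backoff, can be sketched as follows. The constants are illustrative assumptions (a 150 to 300 ms timeout range is a common starting point), and the function names are hypothetical; real values should be tuned to the network's round-trip time.

```python
import random

# Illustrative values, not recommendations for every deployment.
HEARTBEAT_INTERVAL_MS = 50
ELECTION_TIMEOUT_MIN_MS = 150
ELECTION_TIMEOUT_MAX_MS = 300


def next_election_timeout_ms(rng: random.Random) -> float:
    """Pick a fresh random timeout; call again each time the timer resets."""
    return rng.uniform(ELECTION_TIMEOUT_MIN_MS, ELECTION_TIMEOUT_MAX_MS)


def retry_delay_ms(attempt: int, base_ms: float = 20.0, cap_ms: float = 1000.0) -> float:
    """Capped exponential backoff for retrying a failed RPC (attempt starts at 0)."""
    return min(cap_ms, base_ms * (2 ** attempt))
```

Drawing a new timeout on every reset, rather than once at startup, keeps two nodes that happened to pick similar values from colliding repeatedly. Passing in a `random.Random` instance makes the behavior reproducible in tests.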
When to Choose Raft
Choose Raft when you need strong consistency and clear reasoning about correctness. Raft makes progress only while a majority of nodes can communicate, so for systems that only need eventual consistency, that must keep accepting writes on both sides of a network partition, or that operate at very large scale, other architectures may be more appropriate. Raft is a solid choice for systems where ease of implementation, maintainability, and strong correctness guarantees matter.
Further Reading
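For a complete specification and proofs, read the original Raft paper, "In Search of an Understandable Consensus Algorithm" by Diego Ongaro and John Ousterhout. Follow-up resources and reference implementations are useful for studying edge cases and production-ready details.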