In this lecture we show how to use the
failure detector to obtain a protocol
for consensus in message-passing asynchronous systems which
tolerates up to f < n/2 crash failures.
It was left as an exercise
to show how to attain strong completeness using failure
detectors satisfying only weak completeness . Therefore it suffices to
provide a protocol using
failure detectors.
The idea is as follows.
The protocol runs in asynchronous rounds, where in round r,
with the exception of a special DECIDE message,
only messages timestamped with r are sent and processed.
Throughout the execution of the protocol each process p
has an opinion consisting of a bit
.The initial opinion is given by the input bit.
Each opinion has a time of adoption denoted
by t(p) whose initial value is 0.
Both opinion and time of adoption can change during the execution of the
protocol.
During each round there is a (unique) coordinator
which is simply the process with ID number equal to
(n denotes the number of processes).
Notice that the coordinator rotates among the processes.
Notice also that for this to work the processes must agree beforehand
on a consistent numbering among them.
A round r consists of four different phases.
During phase 1, each process, including the coordinator, sends his opinion and the time of its adoption to the coordinator, all timestamped with r.
During phase 2 only the coordinator acts.
The coordinator waits for the first
opinions with timestamp r and selects the one which was
most recently adopted, breaking ties arbitrarily (say by privileging
1 over 0). This value, denoted by v, becomes
the coordinator proposal for round r. The coordinator
sends its (timestamped) proposal to all, including itself.
During phase 3, each process p waits till the proposal for round r is received or the failure detector signals, perhaps incorrectly, that the coordinator is down. If the proposal v is received then it is adopted by p: v(p) is set to v, t(p), the time of adoption, is set to r, and ACK-nowledgement is sent to the coordinator. Otherwise, p sends a NACK to the coordinator.
During phase 4 again only the coordinator acts.
The coordinator waits for the first
ACK's or NACK's (any mixture of the two will
do as long as there are
messages).
If these are all ACK's then the coordinator
decides on v, sends the decision value with a special flag
DECIDE to all and stops.
Whenever a process p receives a DECIDE message it will (a) echo the message to everybody, (b) decide on the same value and (c) stop.
The protocol is described more formally below.
r := 0; tp := 0 while p undecided do r := r+1
phase 1: p sends its opinion (p,r,vp,tp) to the current coordinator c phase 2: c waits for the first n/2+1 opinions of the form (q,r,vq,tq) c selects among them the value vq with largest tq (in case of ties 1 is preferred to 0) c makes a proposal, i.e., sends (c,r,vq) to all phase 3: p waits until the proposal (c,r,v) arrives or
if the proposal is received then p adopts it: vp := v tp := r p sends (r,ACK) to the current coordinator c else p sends (r,NACK) to the current coordinator c phase 4: c waits for n/2+1 messages of the form (r,ACK) or (r,NACK) if all of them are (r,ACK) then c decides: dp := v c sends (r,DECIDE,v) to all including itself whenever p receives a message of the form (s,DECIDE,u) then : p sends (s,DECIDE,u) to all p decides on u
The Non Triviality property is straightforward: only input bits ever become proposals. To show Limited Bureaucracy is left as an exercise. We prove the Agreement property.
Let A(s) be the number of ACK messages sent to the coordinator
c(s) in phase 3 of round s of the protocol.
Let vs be the value proposed by c(s) in round s.
If
we say that vs is locked.
Observe that the first process to decide on a final value must be a
coordinator, and a necessary condition for the coordinator to
do so in phase 4 of round s, is that
.
Note that this condition is not sufficient , because among the
first n/2+1 messages received by the coordinator in
phase 4 of round s there may be a NACK message.
We argue however, that once a value is locked it becomes
the only possible decision value.
Let r be the first round such that
, and
let
be the value proposed by the coordinator in phase 2 of round
r.
To show Agreement property we argue that the following invariant is
maintained by the protocol.
Invariant 12720
For every
, if the coordinator c(s) proposes a value in
phase 2 of round s, then this value is equal to
.
Before we prove the Invariant, let us see how it implies Agreement.
Observe that if a coordinator decides on a final value in
phase 4 of some round s, then it is the value proposed by it
in phase 2 of round s.
By the Invariant,
if
the proposed value is equal to
.On the other hand, if a non-coordinator decides on a final value, it is
a value on which some coordinator decided upon before (phase 4
of the protocol), hence it must be
.
Define:
Notice that in the definition of C(s) by ``receive'' we mean that the opinion of p was one of the first n/2+1 opinions top reach c(s). All others are discarded. In order to prove the Invariant by induction on s, we slightly strengthen it.
Invariant 12740
If
then the following hold in round s:
Proof. Let s=r.
By the definition of r, the value proposed by the coordinator in
round r is
, and
. Observe that only the value proposed by the coordinator in
phase 2 may be adopted by some other process in phase 3 of
the same round.
Hence for every process p, if
just after
performing phase 3 of round r, then p does not adopt a value in
round r, thus tp < r.
Let s>r.
If c(s) completes phase 2 of round s, then |C(s)| =
n/2+1 and hence by clause 2 of the induction
hypothesis
(this is the key
observation).
Let
. By the definition of V(s-1) in phase 2 of round s we have
vq = v and
, hence from clause 3
of the induction hypothesis it follows that the coordinator c(s)
selects and proposes
.
This immediately implies that
, hence
by clause 2 of the induction hypothesis
. Clause 3 of the Invariant is maintained as well,
because if tp is changed for some process p in round s, then
vp is set to
.![]()