next up previous contents
Next: Lecture 9 Up: Lecture 8 Previous: Lecture 8

Solving consensus using $\Diamond W$

In this lecture we show how to use the $\Diamond W$ failure detector to obtain a protocol for consensus in message-passing asynchronous systems which tolerates up to f < n/2 crash failures. It was left as an exercise to show how to attain strong completeness using failure detectors satisfying only weak completeness . Therefore it suffices to provide a protocol using $\Diamond S$ failure detectors.

The idea is as follows. The protocol runs in asynchronous rounds, where in round r, with the exception of a special DECIDE message, only messages timestamped with r are sent and processed. Throughout the execution of the protocol each process p has an opinion consisting of a bit $v(p) \in \{0,1\}$.The initial opinion is given by the input bit. Each opinion has a time of adoption denoted by t(p) whose initial value is 0. Both opinion and time of adoption can change during the execution of the protocol. During each round there is a (unique) coordinator which is simply the process with ID number equal to $(r \mod n) + 1$ (n denotes the number of processes). Notice that the coordinator rotates among the processes. Notice also that for this to work the processes must agree beforehand on a consistent numbering among them. A round r consists of four different phases.

During phase 1, each process, including the coordinator, sends his opinion and the time of its adoption to the coordinator, all timestamped with r.

During phase 2 only the coordinator acts. The coordinator waits for the first $\lfloor \frac n2 +1 \rfloor$opinions with timestamp r and selects the one which was most recently adopted, breaking ties arbitrarily (say by privileging 1 over 0). This value, denoted by v, becomes the coordinator proposal for round r. The coordinator sends its (timestamped) proposal to all, including itself.

During phase 3, each process p waits till the proposal for round r is received or the failure detector signals, perhaps incorrectly, that the coordinator is down. If the proposal v is received then it is adopted by p: v(p) is set to v, t(p), the time of adoption, is set to r, and ACK-nowledgement is sent to the coordinator. Otherwise, p sends a NACK to the coordinator.

During phase 4 again only the coordinator acts. The coordinator waits for the first $\lfloor \frac n2 +1 \rfloor$ACK's or NACK's (any mixture of the two will do as long as there are $\lfloor \frac n2 +1 \rfloor$ messages). If these are all ACK's then the coordinator decides on v, sends the decision value with a special flag DECIDE to all and stops.

Whenever a process p receives a DECIDE message it will (a) echo the message to everybody, (b) decide on the same value and (c) stop.

The protocol is described more formally below.


$v_p := \mbox{input bit}$ r := 0; tp := 0   while p undecided do r := r+1 $c := (r \bmod n) + 1$   phase 1: p sends its opinion (p,r,vp,tp) to the current coordinator c   phase 2: c waits for the first n/2+1 opinions of the form (q,r,vq,tq) c selects among them the value vq with largest tq (in case of ties 1 is preferred to 0) c makes a proposal, i.e., sends (c,r,vq) to all   phase 3: p waits until the proposal (c,r,v) arrives or $c \in D_p$ if the proposal is received then p adopts it: vp := v tp := r p sends (r,ACK) to the current coordinator c else p sends (r,NACK) to the current coordinator c   phase 4: c waits for n/2+1 messages of the form (r,ACK) or (r,NACK) if all of them are (r,ACK) then c decides: dp := v c sends (r,DECIDE,v) to all including itself   whenever p receives a message of the form (s,DECIDE,u) then : p sends (s,DECIDE,u) to all p decides on u

The Non Triviality property is straightforward: only input bits ever become proposals. To show Limited Bureaucracy is left as an exercise. We prove the Agreement property.

Let A(s) be the number of ACK messages sent to the coordinator c(s) in phase 3 of round s of the protocol. Let vs be the value proposed by c(s) in round s. If $A(s)\ge n/2+1$ we say that vs is locked. Observe that the first process to decide on a final value must be a coordinator, and a necessary condition for the coordinator to do so in phase 4 of round s, is that $A(s) \geq n/2 + 1$. Note that this condition is not sufficient , because among the first n/2+1 messages received by the coordinator in phase 4 of round s there may be a NACK message. We argue however, that once a value is locked it becomes the only possible decision value.

Let r be the first round such that $A(r) \geq n/2 + 1$, and let $\hat{v}$ be the value proposed by the coordinator in phase 2 of round r. To show Agreement property we argue that the following invariant is maintained by the protocol.

Invariant 12720

  For every $s \geq r$, if the coordinator c(s) proposes a value in phase 2 of round s, then this value is equal to $\hat{v}$.

Before we prove the Invariant, let us see how it implies Agreement. Observe that if a coordinator decides on a final value in phase 4 of some round s, then it is the value proposed by it in phase 2 of round s. By the Invariant, if $s \geq r$ the proposed value is equal to $\hat{v}$.On the other hand, if a non-coordinator decides on a final value, it is a value on which some coordinator decided upon before (phase 4 of the protocol), hence it must be $\hat{v}$.

Define:

Notice that in the definition of C(s) by ``receive'' we mean that the opinion of p was one of the first n/2+1 opinions top reach c(s). All others are discarded. In order to prove the Invariant by induction on s, we slightly strengthen it.

Invariant 12740

If $s \geq r$ then the following hold in round s:

1.
if the coordinator c(s) completes phase 2, then it proposes $\hat{v}$,
2.
$\vert V(s)\vert \geq n/2 + 1$,
3.
for every process p, if $v_p \not= v$, then tp < r.

Proof. Let s=r. By the definition of r, the value proposed by the coordinator in round r is $\hat{v}$, and $\vert V(r)\vert = A(r) \geq n/2+1$. Observe that only the value proposed by the coordinator in phase 2 may be adopted by some other process in phase 3 of the same round. Hence for every process p, if $v_p \not= v$ just after performing phase 3 of round r, then p does not adopt a value in round r, thus tp < r.

Let s>r. If c(s) completes phase 2 of round s, then |C(s)| = n/2+1 and hence by clause 2 of the induction hypothesis $V(s-1) \cap C(s) \not= \emptyset$ (this is the key observation). Let $q \in V(s-1) \cap C(s)$. By the definition of V(s-1) in phase 2 of round s we have vq = v and $t_q \geq r$, hence from clause 3 of the induction hypothesis it follows that the coordinator c(s) selects and proposes $\hat{v}$. This immediately implies that $V(s) \supseteq V(s-1)$, hence by clause 2 of the induction hypothesis $\vert V(s)\vert \geq
 \vert V(s-1)\vert \geq n/2 + 1$. Clause 3 of the Invariant is maintained as well, because if tp is changed for some process p in round s, then vp is set to $\hat{v}$.$\Box$


next up previous contents
Next: Lecture 9 Up: Lecture 8 Previous: Lecture 8
Viggo Kann
Sat Dec 20 23:41:16 MET 1997