next up previous contents
Next: Lecture 10 Up: Lecture 9 Previous: Lecture 9

Self-Stabilization

In this lecture we will consider the following general problem: A system, e.g. a computer network, is in some steady state executing a protocol- ensuring, say, fair access to a resource requiring mutual exclusion- when suddenly it crashes. After recovery, the system starts executing the same protocol but from some arbitrary illegitimate state. We want the resource allocation protocol to be such that it will converge automatically to the legal steady state in spite of such catastrophic events.

Protocols with these properties are called self-stabilizing. Self-stabilization is an alternative approach to fault-tolerance initiated by Dijkstra [15,16].


  
Figure 9.1: The ring of processors and the demon
\begin{figure}
 \psfrag{T}{$T$}
 \psfrag{B}{$B$}
 \begin{center}
 \psfrag{T}{$T$...
 ...\hspace*{-1.6in}\mbox{}
\includegraphics {daemon.ps}
}
 \end{center}\end{figure}

We will study this general problem for the ring topology, depicted in Figure 9.1. Roughly speaking, we want a steady state where there is exactly one machine designated as ``privileged'' and moreover the privilege should be passed around the ring forever in a fair manner.

We assume that a demon sits in the center of the ring and schedules in some arbitrary fashion which processor is to take a step next. The only constraint we impose on the demon is that in any infinite sequence of steps every process is scheduled infinitely often. This is equivalent to having local mutual exclusion -- it is never the case that two adjacent processors simultaneously take a step. What we ask if there exists a protocol such that, given any initial state, global mutual exclusion (i.e. only one process at the time is privileged) is ensured after finitely many steps. The protocol can therefore be viewed as a way of converting local mutual exclusion into fair, global mutual exclusion.

To make things more precise we assume that each machine is a finite automaton with three states $\{1,2,3\}$.We order these as follows:


\begin{picture}
(95,86)
 \put(10,10){\makebox(0,0){2}}
 \put(85,10){\makebox(0,0...
 ...}}
 \put(81,19){\vector(-1,2){25}}
 \put(43,67){\vector(-1,-2){25}}\end{picture}

Once a machine is scheduled by the demon to take the next step, the new state will be computed as function of the current state and the states of the two neighbouring machines. For symmetry reasons, one special machine T must be designated -- otherwise there are actions of the demon which preserve symmetry unless the number of processors is a prime. We will denote the left neighbor of T with B (T and B stand for ``Top'' and ``Bottom'' respectively.) Next we ``cut'' the ring as indicated in Figure 9.1 and transform it into a path.

Any initial assignment of states to the machine defines a set of arrows pointing from the machine with higher state to that with lower state (using the circular ordering above). A possible scenario is illustrated in the example below.


\begin{picture}
(310,35)
 \put(5,25){\makebox(0,0){$B$}}
 \put(5,10){\makebox(0,...
 ... \put(275,25){\makebox(0,0){$L$}}
 \put(305,25){\makebox(0,0){$T$}}\end{picture}

For a generic node S in the internal part of the path, we let L and R denote its left and right neighbors respectively.

We can now state our problem more precisely.

Given any initial assignment of arrows, we want to end up with one arrow which passes around the ring, from B to T and back endlessly.
The legal steady state is when there is one single arrow going back and forth. We can then designate the node to which the arrow points as the only privileged machine in the ring.

We will argue below that the following rules guarantee this:

1.
$B+1 = R \quad \Rightarrow \quad B \leftarrow B+2$
2.
$L = S+1 \,\vee\, R = S+1 \quad \Rightarrow \quad S \leftarrow S+1$
3.
$L = B \,\wedge\, T \neq B+1 \quad \Rightarrow \quad T \leftarrow B+1$
We shall call these the transition rules.

Next we analyze the results of these rules when applied to the configurations where they result in a change. We shall refer to these as the arrow rules. These rules are equivalent to the transition rules, but using different names will come handy in the proofs.

 
Figure 9.1: The ring of processors and the demon
Case Before After
0. $B \leftarrow R$ $B \rightarrow R$
1. $L \rightarrow S \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}R$ $L \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}S \rightarrow R$
2. $L \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}S \leftarrow R$ $L \leftarrow S \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}R$
3. $L \rightarrow S \leftarrow R$ $L \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}S \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}R$
4. $L \rightarrow S \rightarrow R$ $L \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}S \leftarrow R$
5. $L \leftarrow S \leftarrow R$ $L \rightarrow S \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}R$
6. $L \rightarrow T \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}B$ $L \leftarrow T$
7. $L \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}T \hphantom{\ensuremath{\mbox{}\rightarrow\mbox{}}}B$ $L \leftarrow T$

Clearly, once we only have one arrow it will move back and forth forever, ``bouncing'' at the nodes B and T. This can easily be checked through case analysis. What remains to be done is proving that the system always converges into a state where there is exactly one arrow.

We begin with some simple observations.

Fact 12802

There is always at least one feasible move.  

Proof. We proceed by case analysis. If there is any arrow pointing to a node in the interior of the path, the second transition rule (or, equivalently, one of the arrow rules 2--5) can always be applied. Suppose then that there are no such arrows. When the situation is $B \leftarrow R$, the first arrow rule can be applied. If this arrow too is absent then either there are no arrows at all or, there is a unique arrow $L \rightarrow T$. But then the third transition rule (or, equivalently, arrow rules 6 or 7) can be applied.$\Box$

Fact 12808

Between any two moves of T there is at least one B move.  

Proof. The precondition of transition rule 3 is $T \neq B+1$ and the result is $T \leftarrow B+1$. There are no other transition rules which modify T, so there must exist a move which modifies B between consecutive moves which modify T.$\Box$

The proof of the next fact uses the potential function defined by the number of arrows in the system.

Fact 12816

Any sequence in which B does not appear is finite.  

Proof. Suppose by contradiction that there is an infinite sequence of moves without B's. By Fact 9.2, there cannot be two T's in the sequence. The only rule which increases the number of arrows is arrow rule 7. But since there can be at most one T move in the whole sequence, this creates at most one arrow. Arrow rules 3-5 remove arrows, so there can only be a finite number of these. What remain to be proven is that there cannot be an infinite number of applications of arrow rules 1 and 2. This is however impossible as the arrows which are translated in these cases would disappear into B or T; there can therefore only be a finite number of applications of these rules.$\Box$

The next theorem makes use of another potential function and of a suitably defined predicate.

Definition 12822

Let A be the predicate ``The leftmost arrow exists and it points to the right''.

Definition 12826

y = l + 2r where l and r denote the number of arrows pointing to the left and to the right respectively.

Theorem 12827

There is exactly one arrow within finitely many moves.  

Proof. By the facts above, any sequence of moves is infinite and of the form

\begin{displaymath}
\cdots \mbox{}_F \underbrace{B_T \cdots T \cdots
 \mbox{}_F}...
 ...ed at\strut}}{\mbox{least once}}}
 B_T \cdots T \cdots B \cdots\end{displaymath}

where the subscripts give the value of the predicate A. Suppose A is falsified by arrow rule 6: Then there is only one arrow in the system as the leftmost and rightmost arrows coincide, and the claim follows. Suppose by contradiction that arrow rule 6 is never applied in the entire sequence. We derive a contradiction as follows. We divide the sequence into segments of moves between two consecutive B's (including the first and not including the second). By Fact 9.3 there are infinitely many such segments. We then show that after executing any such segment the potential function y decrease by at least one. But this is impossible because y's value is finite to start with.

We begin by showing that, if A is falsified by any other rule than arrow rule 6, y must decrease by at least one. To see this, we study the effects of the different cases on y.

Case $\Delta y$
0. +1
1.  
2.  
3. -3
4. -3
5.  
6. -1
7. +1

Notice that the only cases that can falsify A are numbers 3 and 4 (as we assume that arrow rule 6 is never applied). The interval between two consecutive actions by B therefore contains one B move (case 0, $\Delta y = +1$), at most one T move (case 7, $\Delta y = +1$)and at least one move belonging to case 3 or 4 ($\Delta y = -3$). The net effect is that $\Delta y \leq -1$.As y cannot decrease indefinitely, at some point the system only contains one arrow.$\Box$


next up previous contents
Next: Lecture 10 Up: Lecture 9 Previous: Lecture 9
Viggo Kann
Sat Dec 20 23:41:16 MET 1997