In this lecture we will continue our study of the maximal (as opposed to the maximum ) independent set problem (henceforth MIS). We will still consider the synchronous, message-passing model but the underlying topology will be that of any graph G=(V, E), not just the ring.
It is worth pointing out that the problem is absolutely trivial in a sequential setting, as the following linear time greedy algorithm shows: pick one vertex, put it into the independent set, remove its neighbors; pick another vertex, put it into the independent set, remove its neighbours, and so on.
In some sense we would like to parallelize this simple algorithm. Perhaps surprisingly, this turns out to be extremely hard to do. In fact, this is hard to do even in the PRAM model, where this problem was originally studied. This seemingly simple problem is so challenging that in order to solve it it was necessary to develop a new and very interesting method called derandomization, to which we return. The solution we will develop makes use of randomization; whether there exists deterministic algorithms for MIS running in poly-logarithmic many rounds is a challenging open problem.
The algorithm we discuss is known as Luby's algorithm, after his inventor . Essentially the same algorithm and analysis, however, can be found in . Here we follow the crisp presentation of .
Although Luby's algorithm was originally developed in the PRAM model, it is also a bonafide distributed algorithm. We now describe the algorithm.
The algorithm runs in stages. During each stage the algorithm computes an independent set I and removes the set . If the leftover graph is non empty it becomes the input to the next stage, otherwise the algorithm stops. The output of the algorithm is the union of all sets I. It is easy to see that this gives a MIS.
A stage of the algorithm is as follows.
The basic problem is to show that during each stage enough progress is made. In particular we want to show that the expected number of stages is . To do this, we will show that each stage removes an expected constant fraction of all the edges of the current graph. We define
and call it the set of low degree neighbours of u. The next definition is pivotal.
A vertex u is good if .
Intuitively, a good vertex has a lot of low degree neighbours and hence has a good chance of making it into .
If u is good, then
An edge e = (u,v) is good if at least one of its neighbors is good. A bad edge is an edge which is not good.
With these definitions at hand we make our plan more specific. First, we show that in any graph, at least half of the edges are good. Second, we show that the probability that a good edge is removed during a stage is a constant fraction. This means that the expected number of good edges which are removed at each stage is a constant fraction of the total and this implies that the expected number of stages is , where n is the number of vertices of the input graph. A more sophisticated analysis actually shows that the running time is with high probability [13,20].
Let EG and EB denote the sets of good and bad edges respectively. Then
Proof. We define a mapping such that
In other words, to each bad edge we can associate a pair of edges (bad or good it doesn't matter) in such a way that the pairs are all disjoint and each pair contains different edges. Such a mapping implies that |EB| is at most half of E.
The mapping is defined as follows. First, direct each edge of G toward the lower degree vertex, resolving ties arbitrarily. Focus now on a bad vertex. This vertex will have some edges pointing inward and some outward. By definition of bad vertex, at least one third of its neighbours have strictly higher degree. Therefore, there are at least twice as many edges pointing outward than edges pointing inward. This enables us to associate a pair of outward edges to every inward edge, each time using two brand new outward edges. Notice now that, again by definition, all inward edges are bad. Hence, this defines a mapping .
Does f satisfies the properties we want? By construction each f(e) is a pair of different edges. To see that for any two edges , we consider two cases. If e1 and e2 point to the same vertex, then by construction. Otherwise, if e1 and e2 are directed towards different vertices u and v, then the only edge that could be in the intersection of f(e1) and f(e2) is (u,v). But due to the fact that the edge is directed, it can be placed in either f(e1) or f(e2), but not in both.
What remains to be shown is that, in expectation, a constant fraction of good edges is removed at each stage. To prove this we will make use of the following two facts.
For any vertex u, .
Proof. We upper bound the complementary event .A vertex does not make it into I if and only if a neighbour whose degree is higher or equal to that of u enters S. Therefore,
It is convenient at this point to make a very important
Remark: Equation (11.1) still holds if the assumption of the independence of the unbiased coin flip is replaced with the much weaker one of pairwise independence. A collection of events are pairwise independent if, for all distinct events we have .
Proof. Using Fact 11.2,
Let e=(u,v) be a good edge. Then, .
Proof. Assume without loss of generality that u is good. Then, is lower bounded by which we now estimate. There are two cases to consider. The first case is: .By Fact 11.3,
The second, complementary, case is: .This case must be examined more carefully. By fact 11.1,
Since each term in this sum is smaller than 1/6, we can also find a subset A of N(u) such that
once again, notice that the proof of Lemma 11.2
hinges only on pairwise independence. Specifically, this is used in
Combining the last two lemmata shows that the expected fraction of edges which are deleted per stage is 1/72, which concludes the proof.