2017-10-16

## Maximum bandwidth path 0928

Given a weighted graph and 2 vertices s and t, find an s-t path of the largest bandwidth.

Def: Path bandwidth is the smallest edge bandwidth along the path.

### Dijkstra

Idea is to expand, pick shortest (max bandwidth) path. It’s a greedy method and guaranteed to find optimal with all positive edges.

### Dijkstra Max Bandwidth

status: in-tree, fringe, unseen

Algorithm Dijkstra(G, s, t)

1. FOR EACH v DO status[v] = unseen
2. status[s] = in-tree
3. FOR EACH [s, v] DO

status[v] = fringe
bw[v] = weight[s, v]; (INSERT(F,v) if using heap)

4. WHILE status[t] $\ne$ in-tree DO

4.1. pick a fringe v with the largest bw[v] (v = POP(F), TOP then DELETE)
4.2. status[v] = in-tree
FOR EACH edge [v, w] DO

4.3. IF status[w] = unseen THEN

status[w] = fringe; dad[w] = v; bw[w] = MIN(bw[v], weight[v, w]) (INSERT(F, w))

4.4. ELSE IF status[w] = fringe and bw[w] < MIN(bw[v], weight[v, w]) THEN

dad[w] = v; bw[w] = MIN(bw[v], weight[v, w]) (DELETE old then INSERT new)

#### Correctness

Claim: This algorithm is correct.

Assume a better path exists. Any edges on this path is better than the smallest edge of the found path. When we try to pick this smallest edge from the fringe, we actually cannot get it. Instead, we will pick the edge on the better path. Therefore, we cannot find the found path. The assumption is invalid. This algorithm is correct.

#### Analysis

The time complexity of this algorithm is $O((n+m) \log n)$. $n$ is number of nodes. $m$ is number of edges. It can actually reduced to $O(m \log n + n)$. Because for all connected graph, $m$ cannot be smaller than $n-1$. $O(n)$ is for initialization. For partially connected graph, we won’t touch isolated nodes. So it still holds.

1. Step 1 takes $O(n)$
2. Step 2 takes $O(1)$
3. Step 3: $O(n \log n)$
4. Step 4 WHILE loop will be executed at most $n - 1$ times. Total time is $O(n \log n)$.
• 4.1 takes $O(\log n)$.
• 4.2 is $O(1)$
5. The FOR loop in step 4 will be executed $m$ times if directed; $2m$ if undirected. Total time is $O(m \log n)$.
• 4.3 takes $O(\log n)$
• 4.4 is $O(\log n)$

#### Heap

In order to use heap as the data structure, it must support DELETE action. We need an extra array to record the positions of the heap elements. Delete that element, then move the last element into that position and update position to keep the heap ordered. So it can be done in $O(\log n)$.

### Graph representation

1. adjacent matrix: If $n$ nodes, the size is $n \times n$.
• very sparse matrix
• $O(n)$ for each edge
2. adjacent list: A length $n$ list. Every element v stores a linked list [w1, w2,…] with weights, which are actually edges (v, w1), (v, w2),…
• In the worst case, still $O(n)$ for each edge.

## Max spanning tree 1004

$G$ is a graph, $T$ is a spanning tree for $G$. If $T$ meets the following requirements:

1. $T$ is a subgraph of $G$.
2. $T$ is a tree.
3. $T$ contains all vertices of $G$

Max spanning tree is a spanning tree whose sum of edge’s weight is the maximum.

### Kruskal Algorithm

Algorithm Kruskal(G)

Sort the edges of $G$ in non-increasing order: $e_{1},e_{2},...e_{m}$ (if “no-decreasing” than for min spanning tree)
FOR $i$ = 1 to $m$

IF $T \cup e_{i}$ does not make a cycle ($e_{i}=(u_{i},v_{i})$, if $u_{i},v_{i}$ do not belong to the same group of $T$) THEN

DO $T = T \cup{e_{i}}$

RETURN $T$

#### Analysis

1. Sorting the edges takes time of $O(m \log m)$.
2. The for loop takes time $O(m\log n)$ where runs $m$ times and each takes $\log n$ to check if the endpoints of the new edge belong to the same group of $T$. We can use 2-3 tree to do it. trace from two endpoints to their roots; then check if the same root.

Since $% $, $% $. Therefore, the overall time is $O(m \log n)$.

#### Correctness

##### It is a tree

Three elements for a tree: graph, all connected, no cycle. It is apparently a graph. In terms of the Kruskal algorithm, it cannot include cycles. How to prove it is connected?

Assume pieces $A$ and $B$ are not connected, but there must be edge $i$ connecting $A$ and $B$, and $i$ must be considered in this algorithm. Since it will not lead to a cycle, it will be added. So, $A$ and $B$ must be connected.

##### It is a max spanning tree

Claim: At any point, $T$ is a part of a max $ST$.

Proof by mathematical induction:

1. By induction on this number $h$ times of for loop in Kruskal algorithm.
2. The claim holds for $h=0$.
3. For $h>0$, assume the claim holds for $h-1$, The max ST is $T$.
4. For the case $h$. Assume after $h-1$ executions of for loop,
• If the edge makes a cycle, we don’t add it, the claim holds.
• If we then add an edge, which is part of $T$. The claim holds.
• Else, we add an edge $i$ which is not a part of $T$. So, at a time point after $h$, a edge $j$ which belongs to the $T$ will not be selected to avoid to form a cycle. Since this edge $j$ is after the $i$ in the $e_{1},e_{2},...e_{m}$, $i>=j$. So, what we get finally is a spanning tree $T - j + i$ with the sum of weights not smaller than the $T$. According to the definition of max $ST$, the spanning tree $T - j + i$ is still a max spanning tree. Therefore, the claim holds at $h$.

## Maximum bandwidth path with MST 1010

Given a weighted graph $G$ and vertices $s$ and $t$, find on $s-t$ path of the max bandwidth. if the graph is undirected, how to find it?

Idea is:

1. construct a max spanning tree $T$ for $G$
2. find the $s-t$ path on $T$
3. return this path

### Correctness

Theorem 1: If $T$ is the the max $ST$ contains the $s-t$ path, $s-t$ is a max bandwidth path in the graph $G$.

#### Proof

1. Assume a path $P$ is the max bandwidth path
2. Pick any edge $e$ in $P$ that is not in $T$.
3. $e$ and some of $T$ form a cycle $C$. (Since a max $ST$ contains all vertices, so there must be a cycle)
4. Claim the bandwidth of $e$ is not larger than any other edges in the cycle. Otherwise, $e$ will be a part of the max spanning tree (thinner edge will be removed). $T$ will not be a max $ST$.
5. With this claim, we replace the edge $e$ by the edges in $T$ that forms a cycle with $e$. The result is still a max-bandwidth path from $s-t$ that has fewer edges not in $T$ (Compared to $P$). Repeat this process, you will get max-bandwidth path that is entirely in $T$.

### Analysis

Compared with Dijkstra algorithm

1. It takes time only $O(m)$, with MST constructed.
• Do DFS with the adjacent list
2. If including MST construction, MST $O(m \log n)$ is generally 4 times faster than Dijkstra
3. Dijkstra is single-source. MST can trace from any source.

## Max bandwidth path divide and conquer 1012

### Algorithm

max-bandwidth path$(G,s,t)$

Remove the smaller half of edges
IF $s$ and $k$ are in the same piece $G_{L}$
THEN recursively work on $G_{L},s,t$
ELSE

Shrink each pieces $P$ into a single vertex $v_{p}$ in the original graph $G$. Connect two pieces using the possible largest edge in the smaller half(How?) crossing them. Get the new graph be $G_{S}$.
Recursively work on $(G_{S},v_{S},v_{T})$

#### smaller half finding

To find the smaller half, there is a $O(n)$ algorithm.

k-th smallest finding$(S, k)$

Do it directly when $\vert S \vert \le 20$; RETURN

1. partition $S$ into groups of 5 (a group with 5 elements)
2. FOR EACH group, find the group medians $x_i$
3. let the set of group medians be $S_{n/5}$, It contains $x_{1}, x_{2}, ..., x_{n/5}$
4. Recursive call on $S_{n/5}$ to find the median $m^{*}$ of $S_{n/5}$
5. split $S$ by $m^{*}$ into $S_{\le m^{*}}$ and $S_{> m^{*}}$
6. IF $% $
THEN recursively process $S_{ \le m^{*}}$ and $k$
ELSE recursively process $S_{ > m^{*}}$ and $k^{'}=k- \vert S_{ \le m^{*}} \vert$
##### Analysis

Assume The algorithm takes $T(n)$.

1. Step 1 takes $O(n)$
2. Step 2 takes $O(n)$. For each group, 9 comparisons at most.
3. Step 3 is trivial
4. Step 4 takes $T(\frac{n}{5})$
5. Step 5 is $O(n)$
6. Step 6 is $\le T(\frac{7n}{10})$
• $\vert S_{ \le m^{*}} \vert$ and $\vert S_{ > m^{*}} \vert$ satisfy $\frac{3n}{10} \le \vert S_* \vert \le \frac{7n}{10}$

Therefore, we have:

Guess $T(n) \le 10 C n$

The guess is valid. So $T(n) = O(n)$

## Reference

This is my class note of CSCE 629 in Texas A&M taught by Dr. Jianer Chen. I also get help from Yun He’s note on MST part. Credit to them.