2018-04-17

## Introduction

Path bandwidth is the smallest edge bandwidth along the path. The problem to solve is: given a weighted graph and 2 vertices s and t, find an s-t path of the largest bandwidth. This project is implemented to solve maximum bandwidth problem of a random undirected graph with Java with OOD. Basically, it includes functions as below.

1. A graph generator in order to generate random undirected graph with two ways, degree or possibility.
2. A heap structure to be used as a data structure storing fringe node while doing Dijstra’s algorithm.
3. An optimized union and find algorithm to be used in maximum spanning tree algorithm
4. 3 routing algorithms
• Dijstra’s with no heap
• Dijstra’s with heap
• Maximum spanning tree with Kruskal’s algorithm

This report will first introduce the implementation details; then show performance with different algorithms on several random graphs. Finally, a discussion will be given to analyze the result.

The code can be found at https://github.com/melonskin/networkOptimizationAlgorithm.

## Implementation

### Random Graph generation

#### Structure of the graph

In order to represent an undirected graph, this project uses a GraphNode class for the vertex and Edge class for edge. The GraphNode contains a set of outgoing edges to connect it with its neighbors. It also has a unique ID to differ it with other nodes. The Edge class contains both ends of this edge, as well as the weight of this edge.

#### Graph generator

As recommended by the project instruction, at first, a big cycle containing all nodes is generated to ensure all vertices are connected. After that, two generating strategies are used. This generator will return a list of graph nodes as output.

1. by degree: We have $n$ vertices. We pick two ends from those vertices at random and add edge with random edge between them until we have $degree * n / 2$ edges.
2. by possibility: We go over all combinations of two vertices, at 20% possibility, we add an edge with random weight between them.

### Heap

This heap structure is used to store fringe nodes in Dijstra’s algorithm. The difference between it and a typical heap is that I may need to delete inside elements in this heap. So I made one instead of using PriorityQueue. The idea is to assign every element an ID. The heap class consists of a position array and a data array. Position array records the position of a given ID in the data array.

I initially tried to use generic type directly for the implementation. But I didn’t figure out how to get the id of a generic-type element. Finally, I decided to create a HeapNode class as the element type in heap and use wrapper for different real data type.

### Union Find

A well optimized union find algorithm is implemented for building maximum spanning tree. Basically, the class contains two arrays, parents and ranks.

1. parents stores the parent ID of a given ID.
2. ranks stores the rank of roots.

It supports two operations, find and union

1. For find(n1), it will trace back from given ID to its parent, and parent’s parent, until reaching an element without a parent, i.e. a root.
• An optimization here is attaching every elements on this tracing path to the final root, so that the height of the tree is reduced dramatically.
2. For union(n1, n2), first we find the roots of two elements. If two roots are the same, do nothing (but it can be a useful tool to check whether an edge between two nodes makes cycle in a graph). Else, based on the ranks of two roots, we attach one to another to make the tree as balanced as possible.

### Dijstra

Idea is to expand, max bandwidth path. It’s a greedy method and guaranteed to find optimal with all positive edges. This project implements two cases with Dijstra’s algorithm, one without heap and the other with heap. Since the main algorithm part of both cases are the same, a base Dijstra class containing the main algorithm is created first. After that, two specific classes with different methods adding and retrieving from the fringe implements the base class.

### Maximum spanning tree

Max spanning tree is a spanning tree whose sum of edge’s weight is the maximum. It can solve the maximum bandwidth problem. Its advantage is that once the MST is constructed, we can find the path between any two ends in linear time. This project implements this algorithm in 3 steps.

1. Offer all unique edges into a max heap
2. Do Kruskal’s algorithm to construct MST with edges stored in the heap
3. For two specific ends, do DFS in MST to find the path

## Performance

### Sample output

This section shows the sample output of 3 different path-finding algorithms. The running time, max bandwidth and the path found are displayed. We can see that the max bandwidth found by all 3 algorithms are the same.

#### Dijstra without heap

The duration of Dijstra without heap is: 53 ms.
Max bandwidth found by Dijstra without heap is 7656
2179->2180->2085->1640->1805->326->47->48->120->119->1307->75->977->976->838->2441->
1214->1846->337->702->800->258->489->2177->708->1700->595->594->1348->2365->2192->
2193->592->1716->955->1713->368->3959->3960->1618->2035


#### Dijstra with heap

The duration of Dijstra with heap is: 3 ms.
Max bandwidth found by Dijstra with heap is 7656
2179->2180->2085->1884->1123->1124->2058->2057->4505->2648->3518->4875->4529->873->
4594->3356->3128->3129->3130->2904->3162->1151->3600->3514->4481->3282->1308->999->
1077->4414->4413->1191->4035->1645->2481->2480->2479->1303->1302->1159->3524->4695->
431->4397->705->3636->1895->4223->4089->2491->498->4257->2035


#### MST

The duration for find path with MST is 3 ms.
Max bandwidth found by MST is 7656
2179->2180->2085->1640->1805->1750->4808->1210->2175->3119->4219->3059->3534->1824->
1823->1822->3373->4265->2529->2530->4098->2591->4302->1076->1075->396->2510->1045->
1505->1504->2935->425->147->4962->4961->3054->2396->2776->4003->490->701->2266->563->
261->4856->970->549->548->547->2181->3426->1748->4377->2912->1739->1738->1765->1544->
3191->4535->3598->3977->976->838->2441->1214->1846->1847->4967->4817->531->2965->
2260->4674->3945->3944->4113->4016->1827->122->123->1306->955->1713->368->3959->
3960->1618->2035


### Graph with degree of 8

The running time is shown as below for the 8-degree case for 5 different random graphs, each with 5 pairs of two randomly chosen ends.

#### Dijstra without heap

Duration, ms: Pair1 Pair2 Pair3 Pair4 Pair5
Graph1 66 12 59 84 48
Graph2 57 89 74 107 69
Graph3 35 4 102 43 92
Graph4 39 41 9 95 62
Graph5 4 29 48 54 48

#### Dijstra with heap

Duration, ms: Pair1 Pair2 Pair3 Pair4 Pair5
Graph1 19 6 12 10 1
Graph2 2 4 5 8 5
Graph3 0 6 11 0 6
Graph4 1 4 1 5 3
Graph5 5 4 4 3 3

#### MST

##### MST build time
Graph Graph1 Graph2 Graph3 Graph4 Graph5
Duration, ms: 95 77 86 39 32
##### MST find time
Duration, ms: Pair1 Pair2 Pair3 Pair4 Pair5
Graph1 9 6 5 6 6
Graph2 3 3 3 4 6
Graph3 2 4 4 4 3
Graph4 4 2 2 3 3
Graph5 3 3 4 3 3

### Graph with 20% connection probability

The running time is shown as below for the 20-percent-probability case for 5 different random graphs, each with 5 pairs of two randomly chosen ends.

#### Dijstra without heap

Duration, ms: Pair1 Pair2 Pair3 Pair4 Pair5
Graph1 313 162 526 484 114
Graph2 754 157 467 83 365
Graph3 500 629 289 556 529
Graph4 324 579 403 367 445
Graph5 311 120 265 274 141

#### Dijstra with heap

Duration, ms: Pair1 Pair2 Pair3 Pair4 Pair5
Graph1 211 71 379 39 190
Graph2 543 61 356 93 131
Graph3 360 457 220 443 346
Graph4 132 493 302 140 384
Graph5 258 61 409 341 6

#### MST

##### MST build time
Graph Graph1 Graph2 Graph3 Graph4 Graph5
Duration, ms: 23233 22521 22546 22787 21769
##### MST find time
Duration, ms: Pair1 Pair2 Pair3 Pair4 Pair5
Graph1 7 4 2 2 2
Graph2 2 2 3 2 2
Graph3 2 1 2 2 2
Graph4 2 2 2 3 2
Graph5 2 2 3 3 2

## Discussion

### Time complexity

The time complexity of different algorithms used in this project is listed as below.

Algorithm Time complexity
Dijstra without heap $O(n^2)$
Dijstra with heap $O((n+m) \log n)$
MST build $O(m \log m)$
MST find $O(m)$

### Case of degree of 8

In the case of degree of 8, number of edges $m$ is proportional to the number of vertices $n$, i.e. $m = degree * n / 2 = 4n$.

#### Dijstra’s algorithm with and without heap

Based on the theoretical time complexity of two Dijstra’s, Dijstra with heap is much faster than Dijstra without heap. It’s also be proven by our performance table. Because, insertion and deletion in heap take $O(\log n)$ time. If we don’t use heap, finding the fringe vertex with maximum bandwidth among all fringe vertices takes time $O(n)$. Therefore, the optimization with heap improves the performance dramatically.

#### MST

The time building MST is $O(m \log n)$ in this case, same as the time complexity of Dijstra with heap. We can see from the performance table that the time of building MST is relatively small. Building MST has a larger constant term, since it needs more operation to insert and get edges from the heap. Therefore, we understand the time of building MST is longer. Finding path in MST is linear in time. But it may need to travel a long path in the MST than Dijstra’s (observed during experiment). And if we are lucky, Dijstra’s with heap can find path very quickly, since it may only need to examine a few nodes and edges.

### Case with 20% probability

In this case, the number of edges $m$ is $O(n^2)$, much more than the first case. We can observe that both Dijstra’s algorithm take longer time in this case. This is because they may need to explore more nodes due to a larger number of edges. Therefore, although in the worst case, time complexity of Dijstra’s without heap is unrelated with $m$. It still takes a longer time in our random case.

#### Dijstra’s algorithm with and without heap

We can observe from the performance of two Dijstra’s are similar. The one with heap might be slightly better. Because in this case, $O(n^2)$ is not necessarily larger than $O((n+m) \log n)$. Notice that the performance of Dijstra’s in this case is worse than that in the 8-degree case. This is because we have much more edges to examine now.

#### MST

Building MST takes a much longer time than Dijstra with heap and the 8-degree MST case. Because we have much more edges and we need to insert and poll all edges in the heap now. Dijstra may find the path very quickly after examining a few vertices. However, the finding path part is the similar to the 8-degree case. Because the tree contains exactly the same number of vertices and edges. We do DFS in linear time to find the path.

### Conclusion and future work

Based on the performance table, although the time complexity of Dijstra’s algorithm and MST are similar, the practical running time can differ largely. Because theoretically, we consider the worst case. The real cases can be much random. Apparently, if we find the max bandwidth path between two specific vertices only once, Dijstra’s is better choice, especially in a complex graph. But if we would like to find the path for different pairs of vertices, MST is much better, since once the MST is constructed, paths can be found in linear time.

For further improvement, it seems unnecessary to have duplicate edges in the Kruskal’s algorithm. It will need a duplicate check and double the size of the heap although we don’t use half of the array inside the heap. So the id of edge can be organized so that we only care about the first half of edges, which don’t have duplicates. Another possible improvement is that we can use adjacent list to store the MST, currently this project uses a hash map.