Introduction
Path bandwidth is the smallest edge bandwidth along the path. The problem to solve is: given a weighted graph and 2 vertices s and t, find an s-t path of the largest bandwidth. This project is implemented to solve maximum bandwidth problem of a random undirected graph with Java with OOD. Basically, it includes functions as below.
- A graph generator in order to generate random undirected graph with two ways, degree or possibility.
- A heap structure to be used as a data structure storing fringe node while doing Dijstra’s algorithm.
- An optimized union and find algorithm to be used in maximum spanning tree algorithm
- 3 routing algorithms
- Dijstra’s with no heap
- Dijstra’s with heap
- Maximum spanning tree with Kruskal’s algorithm
This report will first introduce the implementation details; then show performance with different algorithms on several random graphs. Finally, a discussion will be given to analyze the result.
The code can be found at https://github.com/melonskin/networkOptimizationAlgorithm.
Implementation
Random Graph generation
Structure of the graph
In order to represent an undirected graph, this project uses a GraphNode
class for the vertex and Edge
class for edge. The GraphNode
contains a set of outgoing edges to connect it with its neighbors. It also has a unique ID to differ it with other nodes. The Edge
class contains both ends of this edge, as well as the weight of this edge.
Graph generator
As recommended by the project instruction, at first, a big cycle containing all nodes is generated to ensure all vertices are connected. After that, two generating strategies are used. This generator will return a list of graph nodes as output.
- by degree: We have vertices. We pick two ends from those vertices at random and add edge with random edge between them until we have edges.
- by possibility: We go over all combinations of two vertices, at 20% possibility, we add an edge with random weight between them.
Heap
This heap structure is used to store fringe nodes in Dijstra’s algorithm. The difference between it and a typical heap is that I may need to delete inside elements in this heap. So I made one instead of using PriorityQueue
. The idea is to assign every element an ID. The heap class consists of a position array and a data array. Position array records the position of a given ID in the data array.
I initially tried to use generic type directly for the implementation. But I didn’t figure out how to get the id of a generic-type element. Finally, I decided to create a HeapNode class as the element type in heap and use wrapper for different real data type.
Union Find
A well optimized union find algorithm is implemented for building maximum spanning tree. Basically, the class contains two arrays, parents and ranks.
parents
stores the parent ID of a given ID.ranks
stores the rank of roots.
It supports two operations, find and union
- For
find(n1)
, it will trace back from given ID to its parent, and parent’s parent, until reaching an element without a parent, i.e. a root.- An optimization here is attaching every elements on this tracing path to the final root, so that the height of the tree is reduced dramatically.
- For
union(n1, n2)
, first we find the roots of two elements. If two roots are the same, do nothing (but it can be a useful tool to check whether an edge between two nodes makes cycle in a graph). Else, based on the ranks of two roots, we attach one to another to make the tree as balanced as possible.
Dijstra
Idea is to expand, max bandwidth path. It’s a greedy method and guaranteed to find optimal with all positive edges. This project implements two cases with Dijstra’s algorithm, one without heap and the other with heap. Since the main algorithm part of both cases are the same, a base Dijstra class containing the main algorithm is created first. After that, two specific classes with different methods adding and retrieving from the fringe implements the base class.
Maximum spanning tree
Max spanning tree is a spanning tree whose sum of edge’s weight is the maximum. It can solve the maximum bandwidth problem. Its advantage is that once the MST is constructed, we can find the path between any two ends in linear time. This project implements this algorithm in 3 steps.
- Offer all unique edges into a max heap
- Do Kruskal’s algorithm to construct MST with edges stored in the heap
- For two specific ends, do DFS in MST to find the path
Performance
Sample output
This section shows the sample output of 3 different path-finding algorithms. The running time, max bandwidth and the path found are displayed. We can see that the max bandwidth found by all 3 algorithms are the same.
Dijstra without heap
The duration of Dijstra without heap is: 53 ms.
Max bandwidth found by Dijstra without heap is 7656
2179->2180->2085->1640->1805->326->47->48->120->119->1307->75->977->976->838->2441->
1214->1846->337->702->800->258->489->2177->708->1700->595->594->1348->2365->2192->
2193->592->1716->955->1713->368->3959->3960->1618->2035
Dijstra with heap
The duration of Dijstra with heap is: 3 ms.
Max bandwidth found by Dijstra with heap is 7656
2179->2180->2085->1884->1123->1124->2058->2057->4505->2648->3518->4875->4529->873->
4594->3356->3128->3129->3130->2904->3162->1151->3600->3514->4481->3282->1308->999->
1077->4414->4413->1191->4035->1645->2481->2480->2479->1303->1302->1159->3524->4695->
431->4397->705->3636->1895->4223->4089->2491->498->4257->2035
MST
The duration for find path with MST is 3 ms.
Max bandwidth found by MST is 7656
2179->2180->2085->1640->1805->1750->4808->1210->2175->3119->4219->3059->3534->1824->
1823->1822->3373->4265->2529->2530->4098->2591->4302->1076->1075->396->2510->1045->
1505->1504->2935->425->147->4962->4961->3054->2396->2776->4003->490->701->2266->563->
261->4856->970->549->548->547->2181->3426->1748->4377->2912->1739->1738->1765->1544->
3191->4535->3598->3977->976->838->2441->1214->1846->1847->4967->4817->531->2965->
2260->4674->3945->3944->4113->4016->1827->122->123->1306->955->1713->368->3959->
3960->1618->2035
Graph with degree of 8
The running time is shown as below for the 8-degree case for 5 different random graphs, each with 5 pairs of two randomly chosen ends.
Dijstra without heap
Duration, ms: | Pair1 | Pair2 | Pair3 | Pair4 | Pair5 |
---|---|---|---|---|---|
Graph1 | 66 | 12 | 59 | 84 | 48 |
Graph2 | 57 | 89 | 74 | 107 | 69 |
Graph3 | 35 | 4 | 102 | 43 | 92 |
Graph4 | 39 | 41 | 9 | 95 | 62 |
Graph5 | 4 | 29 | 48 | 54 | 48 |
Dijstra with heap
Duration, ms: | Pair1 | Pair2 | Pair3 | Pair4 | Pair5 |
---|---|---|---|---|---|
Graph1 | 19 | 6 | 12 | 10 | 1 |
Graph2 | 2 | 4 | 5 | 8 | 5 |
Graph3 | 0 | 6 | 11 | 0 | 6 |
Graph4 | 1 | 4 | 1 | 5 | 3 |
Graph5 | 5 | 4 | 4 | 3 | 3 |
MST
MST build time
Graph | Graph1 | Graph2 | Graph3 | Graph4 | Graph5 |
---|---|---|---|---|---|
Duration, ms: | 95 | 77 | 86 | 39 | 32 |
MST find time
Duration, ms: | Pair1 | Pair2 | Pair3 | Pair4 | Pair5 |
---|---|---|---|---|---|
Graph1 | 9 | 6 | 5 | 6 | 6 |
Graph2 | 3 | 3 | 3 | 4 | 6 |
Graph3 | 2 | 4 | 4 | 4 | 3 |
Graph4 | 4 | 2 | 2 | 3 | 3 |
Graph5 | 3 | 3 | 4 | 3 | 3 |
Graph with 20% connection probability
The running time is shown as below for the 20-percent-probability case for 5 different random graphs, each with 5 pairs of two randomly chosen ends.
Dijstra without heap
Duration, ms: | Pair1 | Pair2 | Pair3 | Pair4 | Pair5 |
---|---|---|---|---|---|
Graph1 | 313 | 162 | 526 | 484 | 114 |
Graph2 | 754 | 157 | 467 | 83 | 365 |
Graph3 | 500 | 629 | 289 | 556 | 529 |
Graph4 | 324 | 579 | 403 | 367 | 445 |
Graph5 | 311 | 120 | 265 | 274 | 141 |
Dijstra with heap
Duration, ms: | Pair1 | Pair2 | Pair3 | Pair4 | Pair5 |
---|---|---|---|---|---|
Graph1 | 211 | 71 | 379 | 39 | 190 |
Graph2 | 543 | 61 | 356 | 93 | 131 |
Graph3 | 360 | 457 | 220 | 443 | 346 |
Graph4 | 132 | 493 | 302 | 140 | 384 |
Graph5 | 258 | 61 | 409 | 341 | 6 |
MST
MST build time
Graph | Graph1 | Graph2 | Graph3 | Graph4 | Graph5 |
---|---|---|---|---|---|
Duration, ms: | 23233 | 22521 | 22546 | 22787 | 21769 |
MST find time
Duration, ms: | Pair1 | Pair2 | Pair3 | Pair4 | Pair5 |
---|---|---|---|---|---|
Graph1 | 7 | 4 | 2 | 2 | 2 |
Graph2 | 2 | 2 | 3 | 2 | 2 |
Graph3 | 2 | 1 | 2 | 2 | 2 |
Graph4 | 2 | 2 | 2 | 3 | 2 |
Graph5 | 2 | 2 | 3 | 3 | 2 |
Discussion
Time complexity
The time complexity of different algorithms used in this project is listed as below.
Algorithm | Time complexity |
---|---|
Dijstra without heap | |
Dijstra with heap | |
MST build | |
MST find |
Case of degree of 8
In the case of degree of 8, number of edges is proportional to the number of vertices , i.e. .
Dijstra’s algorithm with and without heap
Based on the theoretical time complexity of two Dijstra’s, Dijstra with heap is much faster than Dijstra without heap. It’s also be proven by our performance table. Because, insertion and deletion in heap take time. If we don’t use heap, finding the fringe vertex with maximum bandwidth among all fringe vertices takes time . Therefore, the optimization with heap improves the performance dramatically.
MST
The time building MST is in this case, same as the time complexity of Dijstra with heap. We can see from the performance table that the time of building MST is relatively small. Building MST has a larger constant term, since it needs more operation to insert and get edges from the heap. Therefore, we understand the time of building MST is longer. Finding path in MST is linear in time. But it may need to travel a long path in the MST than Dijstra’s (observed during experiment). And if we are lucky, Dijstra’s with heap can find path very quickly, since it may only need to examine a few nodes and edges.
Case with 20% probability
In this case, the number of edges is , much more than the first case. We can observe that both Dijstra’s algorithm take longer time in this case. This is because they may need to explore more nodes due to a larger number of edges. Therefore, although in the worst case, time complexity of Dijstra’s without heap is unrelated with . It still takes a longer time in our random case.
Dijstra’s algorithm with and without heap
We can observe from the performance of two Dijstra’s are similar. The one with heap might be slightly better. Because in this case, is not necessarily larger than . Notice that the performance of Dijstra’s in this case is worse than that in the 8-degree case. This is because we have much more edges to examine now.
MST
Building MST takes a much longer time than Dijstra with heap and the 8-degree MST case. Because we have much more edges and we need to insert and poll all edges in the heap now. Dijstra may find the path very quickly after examining a few vertices. However, the finding path part is the similar to the 8-degree case. Because the tree contains exactly the same number of vertices and edges. We do DFS in linear time to find the path.
Conclusion and future work
Based on the performance table, although the time complexity of Dijstra’s algorithm and MST are similar, the practical running time can differ largely. Because theoretically, we consider the worst case. The real cases can be much random. Apparently, if we find the max bandwidth path between two specific vertices only once, Dijstra’s is better choice, especially in a complex graph. But if we would like to find the path for different pairs of vertices, MST is much better, since once the MST is constructed, paths can be found in linear time.
For further improvement, it seems unnecessary to have duplicate edges in the Kruskal’s algorithm. It will need a duplicate check and double the size of the heap although we don’t use half of the array inside the heap. So the id of edge can be organized so that we only care about the first half of edges, which don’t have duplicates. Another possible improvement is that we can use adjacent list to store the MST, currently this project uses a hash map.