topic	data structures and algorithms
data types	stack, queue, bag, union-find, priority queue
sorting	quicksort, mergesort, heapsort
searching	BST, red-black BST, hash table
graphs	BFS, DFS, Prim, Kruskal,DijKstra
strings	radix sorts, tries, KMP, regexps, data compression
advanced	B-tree, suffix array, maxflow

union-find

union: connect two objects
find/connected query: is there a path connecting the two objects

public class QuickFindUF {
    private int[] id;
    public QuickFindUF(int N) {
        id = new int[N];
        for (int i = 0, i < N, i++) 
            id[i] = i;
    }
    public boolean connected(int p, int q)
    {
        return (id[p] == id[q])
    }
    public void union (int p, int q)
    {
        pid = id[p];
        qid = id[q];
        for (int i = 0, i < N, i++) 
            if (id[i] == pid) id[i] = qid;
    }
}

if performing N union operation on N objects, the time is \(N^2\), which is too slow

union too expensive (N array accesses)
trees are flat, but too expensive to keep them flat

quick union

public class QuickFindUF {
    private int[] id;
    public QuickFindUF(int N) {
        id = new int[N];
        for (int i = 0, i < N, i++) 
            id[i] = i;
    }
    
    private int root(int i)
    {
        while (i != id[i]) i = id[i];
        return i;
    }
    
    public boolean connected(int p, int q)
    {
        return (root[p] == root[q])
    }
    public void union (int p, int q)
    {
        int i = root(p);
        int j = root(q);
        id[i] = j;
    }
}

quick union is also too slow.

trees can get tall
find too expensive ( could be N array accesses)

quick union improvement (weighted)

idea:

modify quick-uniton to avoid tall trees
keep track of size of each tree
balance by linking root of smaller tree to root of larger tree

public class QuickFindUF {
    private int[] id;
    public QuickFindUF(int N) {
        id = new int[N];
        sz = new int[N];
        for (int i = 0, i < N, i++) 
            id[i] = i;
            sz[i] = 1;
    }
    
    private int root(int i)
    {
        while (i != id[i]) i = id[i];
        return i;
    }
    
    public boolean connected(int p, int q)
    {
        return (root[p] == root[q])
    }
    public void union (int p, int q)
    {
        int i = root(p);
        int j = root(q);
        if (i == j) return;
        if (sz[i] < sz[j]) {id[i] = j; sz[j] += sz[i];}
        else               {id[j] = i; sz[i] += sz[j];}
    }
}

depth of any node x is at most \(log_2N\)：
increase by 1 when tree T1 containing x is merged into another tree T2

size of the tree containing x at least doubles since size(T2)>size(T1)
size of the tree can double at most \(\log_2N\) \(\lg N\) times

compare

algorithm	initialize	union	connected
quick-find	N	N	1
quick-union	N	N*	N
weighted(QU)	N	lgN*	lgN

*: includes cost of finding roots

improvement 2: path compression

two-pass implementation: add second loop to root() to set the id[ ] of each examined node to the root

simpler one-pass variant: make every other node in path point to its grandparent(thereby halving path length)

private int root(int i)
{
    while (i !=id[i])
    {
        id[i] = id[id[i]];
        i = id[i];
    }
    return i;
}

M union-find operation on a set of N objects

algorithm	worst-case time
quick find	M N
quick union	M N
weighted QU	N+MlogN
QU+pathcomp	N+MlogN
weig +pathc	N+Mlg*N

application

percolation

create 2 virtual sites (and connections to top and bottom)
connected if virtual top site is connected to the virtual bottom site

threshold: 0.592746

analysis of algorithms

plot the running time \(T\) vs size of the input \(N\).
The slope for log-log plot is \(s\), so \(T\) is propotional to \(N^s\)
\(T=aN^b\); \(a\) and \(b\) can be estimated by changing size \(N\)

mathematical model

total running time: sum of cost times frequency for all operations
pick an operation that is most expensive or often to compute the running time instead of going over all details

tilde notation

definition: \(f(N) \sim g(N)\) means \(\lim_{N\rightarrow \infty} \frac{f(N)}{g(N)}=1\)
ignore the lower order terms
e.g.: \(1/6N^3 +20N+16 \sim 1/6N^3\)

order-of-growth classifications

order: 1, \(\log N\),\(N\), \(N \log N\), \(N^2\), \(N^3\), \(2^N\)

better to have linear and linearithemic scales

order	description	example
1	statement	add 2 numbers
logN	divide in half	binary search
N	loop	find the maximum
N log N	divide and conquer	mergesort
\(N^2\)	double loop	check all pairs

analysis of algorithms

estimate the upper bound \(O(N^n)\) and the lower bound \(\Omega(N^n)\) of the running time \(\Theta(N^n)\) of an algorithm

memory

memory usage by bytes

Type	bytes
boolean	1
byte	1
char	2
int	4
float	4
long	8
double	8
char[]	2N+24
int[]	4N+24
double[]	8N+24

AI 2

Algorithm 17

Amazon 1

Authorization 1

Blog 3

Bootstrap 1

C++ 1

CCpp 5

CSS 2

Cloud 3

Code 1

Crawler 1

DNS 1

Database 17

DeepLearning 1

Design 17

Development 1

Docker 1

English 1

Express 1

GDB 1

Go 3

Google 4

HTML 3

IOS 1

Java 17

Javascript 4

Jekyll 1

Linux 4

MacOS 2

MachineLearning 17

Markdown 4

Mobile 1

MongoDB 2

Multi-threading 3

NAS 1

Network 11

NeuralNetwork 10

Node 1

OS 8

Public-speaking 1

Python 15

RESTful 1

Rails 9

React 1

Redis 1

Ruby 6

Shell 2

Spring 2

System 17

TCP 1

TDD 1

Thread 2

Vim 1

awk 1

git 1

jQuery 1

media 1

network 1

php 1

Princeton Algorithms P1W1 Union and Find

union-find

quick union

quick union improvement (weighted)

improvement 2: path compression

application

percolation

analysis of algorithms

mathematical model

tilde notation

order-of-growth classifications

analysis of algorithms

memory