merge sort
 divide array into two halves
 recursively sort each half
 merge two halves
mergesort
time: at most \(N \log N\) compares and \(6N \lg N\) array accesses to sort any array of size \(N\).
memory: use extra space proportional to N
merge
Goal: given two sorted subarrays a[lo]
to a[mid]
and a[mid+1]
to a[hi]
, replace with sorted subarray a[lo]
to a[hi]
Copy to auxiliary array aux[]
, use index i
and j
for each half, use index k
for the output a[]
;
compare aux[]
of each half with the minimum index, then copy the smaller one to a[]
;
increase both the corresponding index of the smaller one and k
by one.
code for merge
// assertion
assert isSorted(a,lo,mid);
assert isSorted(a, mid+1, hi);
for (int k = lo; k<=hi; k++)
aux[k] = a[k];
int i = lo, j = mid + 1;
for (int k = lo; k <= hi, k++)
{
if (i > mid) a[k] = aux[j++];
else if (j > hi) a[k] = aux[i++];
else if (less(aux[j], aux[i]) a[k] = aux[j++];
else a[k] = aux[i++];
}
assert isSorted(a, lo, hi);
code for mergesort
create aux array outside the merge class
if (hi <= lo) return;
int mid = lo + (hi  lo)/2;
sort(a, aux, lo, mid);
sort(a, aux, mid+1, hi);
merge(a, aux, lo, mid, hi);
assertion
java ea prog // enable assertion
java da prog // disable assertion
improvement

Use insertion sort for small subarrays.
 mergesort has too much overhead for tiny subarrays
 cutoff to insertion sort for about 7 items.
if (hi <= lo +cutoff 1) { Insertion.sort(a, lo, hi); return; }
 Stop if already sorted
 if the biggest in the first half is less than or equal to the smallest in second half.
 Helps for partiallyordered arrays
if (!less(a[mid+1], a[mid]) return;
 eliminate the copy to the auxiliary array (save time but not space)
bottomup mergesort
no recursion
 pass through array, merging subarrays of size 1
 repeat for subarray of size 2, 4, 8, 18, …
int N = a.length;
aux = new Comparable[N];
for (int sz = 1; sz < N; sz = sz + sz)
for (int lo = 0; lo < N  sz; lo += sz+sz)
merge(a, lo, lo+sz1, Math.min(lo+sz+sz1, N1__;
sorting complexity
Computational complexity: to study efficiency of algorithms for solving a particular problem X.
Model of computation: allowable operations
Cost model: Operation count(s)
Upper bound: Cost guarantee provided by some algorithms for X
Lower bound: Proven limit on cost guarantee of all algorithms for X
Optimal algorithm: Algorithm with best possible cost guarantee for X
Example: sorting
Model of computation: decision tree
Cost model: # of compares
Upper bound: \(\sim N \lg N\) from mergesort
Lower bound: \(N \lg N\)
Optimal algorithm: mergesort (but not for space)
Any comparebased sorting algorithm must use at least \(\lg(N!) \sim N \lg N\) compares in the worst case.
 worst case dictated by height h of decision tree.
 binary tree of height h has at most \(2^h\) leaves
 \(N!\) different orderings \(\rightarrow\) at least \(N!\) leaves
\(2^h \ge \text{# leaves} \ge N! \)
Lower bound may not hold if the algorithm has some information:
 initial order of the input
 distribution of key values
 representation of the keys
comparators
Can sort using an alternate order or many different order
Must be a total order
Arrays.sort(a, String.CASE_INSENSITIVE_ORDER);
public static void sort(Object[] a, Comparator comparator)
{
int N = a.length;
for (int i = 0; i<N; i++)
for (int j = i; j > 0 && less(comparator, a[j], a[j1]); j)
exch(a,j,j1);
private static boolean less(Comparator c, Object v, Object w)
{ return c.compare(v,w) < 0; }
private static void exch(Object[] a, int i, int j)
{ Object swap = a[i]; a[i]=a[j]; a[j] = swap; }
}
implement
public static final Comparator<Student> BY_NAME = new ByName();
private static class ByName implements Comparator<Student>
{
public int compare(Student v, Student w)
{ return v.name.compareTo(w.name); }
}
// use
Arrays.sort(a, Student.BY_NAME);
stability
Insertion sort and mergesort are stable (but not selection sort or shellsort)
For Insertion sort, equal items never move pass each other.
Selection sort is not stable since it have longdistance exchange might move an item pass some equal item.
The original order of equal items may be changed so unstable.
quicksort
quicksort
 shuffle the array
 partition so that, for some j
 entry
a[j]
is in place  no larger entry to the left of j
 no smaller entry to the right of j
 entry
 sort each piece recursively.
Phase 1: repeat until i and j pointers cross.
 scan i from the left to right so long as (
a[i]<a[lo]
)  scan j from the right to left so long as (
a[j]>a[lo]
)  exchange
a[i]
witha[j]
Phase 2: When pointer cross
 exchange
a[lo]
witha[j]
private static int partition(Comparable[] a, int lo, int hi)
{
int i = lo, j = hi + 1;
while (true)
{
while (i < hi && less(a[++i], a[lo]);
while (j > lo && less(a[lo], a[j]);
if (i >= j) break;
exch(a, i, j);
}
// phase 2
exch(a, lo, j);
return j;
}
// quicksort, recursively
public class Quick
{
private static int partition(Comparable[] a, int lo, int hi)
{...}
public static void sort(Comparable[] a)
{
StdRandom.shuffle(a);
sort(a, 0, a.length1);
}
private static void sort(Comparable[] a, int lo, int hi)
{
if (hi <= lo + CUTOFF  1)
{
Insertion.sort(a, lo, hi);
return;
}
int j = partition(a, lo, hi);
sort(a, lo, j1);
sort(a, j+1, hi);
}
details
 partition inplace: does not use extra space
 terminating the loop: testing whether the pointers cross is a bit trickier than it might seem.
 staying in bounds. The (
j == lo
) test is redundant but the (i == hi
) is not.  preserving randomness: shuffling is needed for performance guarantee.
 Equal keys: when duplicates are presents, it is better to stop on keys equal to the partitioning item’s key.
best case: \(\sim N \log N\)
worst case: \(\sim 0.5 N^2\) (if already sorted, so shuffling is needed.)
The average number of compares to quicksort an array of N distinct keys is \(\sim 2N \ln N\) and the number of exchanges is \(\sim \frac{1}{3}N \lg N\)
average case:
 number of compare is \(\sim 1.39 N \lg N\)
 39% more compares than mergesort
 but faster than mergesort because of less data movement
properties
 partitioning: constant extra space
 depth of recursion: logarithmic extra space
 not stable
improvements
 insertion sort small subarrays (around 10)
 best choice of pivot item is the median
// median
int m = medianOf3(a, lo, lo + (hi  lo)/2, hi);
swap(a, lo, m);
selection
Goal: Given an array of N items, find a \(k^{th}\) smallest item.
Upper bound: \(N \log N\)
Lower bound: \(N\)
quickselect
 entry
a[j]
is in place  no larger entry to the left of j
 no smaller entry to the right of j
StdRandom.shuffle(a);
int lo = 0; hi = a.length  1;
while (hi > lo)
{
int j = partition(a, lo, hi);
if (j < k) lo = j + 1;
else if (j > k) hi = j  1;
else return a[k];
}
return a[k];
Quickselect takes linear time on average.
There is a comparebased selection algorithm whose worstcase running time is linear. But constants are too high \(\rightarrow\) not used in practice.
duplicate keys
Often, purpose of sort is to bring items with equal keys together.
 sort population by age
 find collinear points
 remove duplicates from mailing list
 sort job applicants
mergesort with duplicate keys: between \(\frac{1}{2}N \lg N\) and \(N\lg N\) compares.
quicksort with duplicate keys: algorithm goes quadratic unless partitioning stops on equal keys.
Recommend: Stop scans on items equal to the partitioning item. \(\sim N \lg N\) compares when all keys equal.
Desirable: Put all items equal to the partitioning item in place.
3way partitioning (Dijkstra)
 entries between
lt
andgt
equal to partition itemv
.  no larger entries to left of
lt
 no smaller entries to the right of
lt
public static void sort(Comparable[] a, int lo, int hi)
{
if hi <= lo return;
int lt = lo; gt = hi;
Comparable v = a[lo];
int i = lo;
while (i <= gt)
{
int cmp = a[i].compareTo(v);
if cmp > 0 exch(a, i, gt);
else if cmp <0 exch(a, i++, lt++);
else i++;
}
sort(a, lo, lt  1);
sort(a, gt + 1, hi);
}
Quicksort with 3way partitioning is entropy optimal.
system sorts
Arrays.sort()
 has different method for each primitive type
 has a method for data types that implement
Comparable
 has a method that uses a
Comparator
 use tuned quicksort for primitive types (performance is key factor); tuned mergesort for objects
import java.util.Arrays;
Arrays.sort(a)