Greedy algorithms

Example: Counting money

Suppose you want to count out a certain amount of money, using the fewest possible bills and coins
A greedy algorithm would do this would be:
At each step, take the largest possible bill or coin that does not overshoot
Example: To make $6.39, you can choose:


Doesn't always work.

In some (fictional) monetary system, “krons” come in 1 kron, 7 kron, and 10 kron coins
Using a greedy algorithm to count out 15 krons, you would get


Spaning Trees

Suppose you own a telecom company and you're looking to lay cable in a new neighborhood. ...

A spanning tree T of a connected graph G=(V,E)G = (V,E) is a
subgraph of GG that is:

The minimum spanning tree (MST) problem:
Suppose that we have a connected, undirected graph G=(V,E)G=(V,E), with a numerical weighting w(u,v)w(u,v) for each edge (u,v)(u,v).
Problem: Find an acyclic subset TET \subseteq E that connects all of the vertices in VV, and minimizes:
Σ{w(u,v)(u,v)T}\Sigma \{w(u,v) | (u,v) \in T\}
Solution: We will look for an algorithm of the form:
ETE_T \leftarrow empty set of edges
while (ETE_T is not a spanning tree)
- add an edge to ETE_T
At each stage we will ensure that ETE_T is a subset of a MST


If ee is a minimum-weight edge in a connected
graph, then ee must be an edge in at least one
minimum spanning tree

Must ee be an edge in all minimum
spanning trees of the graph?

If every edge in a connected graph GG has a
distinct weight, then GG must have exactly one
minimum spanning tree?

Proof
Assume the contrary, that there are two different MSTs AA and BB.


Prim's Algorithm

Idea: Construct a MST by greedily expanding an initial tree, composed of just a single vertex of the graph's vertices. To expand the tree, attach the nearest vertex, i.e. add the vertex connected by the lowest weight edge amongst the neighbors of the current expanding tree such that we don't create a cycle.

Pseudocode:
Input: A weighted connected graph G=(V,E)G = (V,E)
Output: ErE_r the set of edges composing a minimum spanning tree of G.

Prim(G):
Vrv0V_r \leftarrow {v_0}
ErE_r \leftarrow \varnothing
for i1i \leftarrow 1 to V1|V|- 1 do:
- find a minimum-weight edge e=(v,u)e^* = (v^*, u^*) among all the edges (v,u)(v, u) such that vv is in VrV_r and uu is in VVrV - V_r
- VrVr{u}V_r \leftarrow V_r \cup \{u^*\}
- ErEr{e}E_r \leftarrow E_r \cup \{e^*\}
return ErE_r

Let's try two examples


MSTexample.png

MSTexample2.png


A bit more detail:

The input is the graph G=(V, E), and a root r \in V.
for each v in V
  key[v] ← ∞;
  parent[v] ← null;
key[r]0;
add all vertices in V to the queue Q.
while (Q is nonempty) {
  u ← extractMin(Q);
  for each vertex v that is adjacent to u {
    if v ∈ Q and weight(u,v) < key[v] {
      parent[v] ← u;
      key[v]weight(u,v);
    }
  }
}

Ideas: Store all vertices that are not in (VT,ETV_T, E_T) in a priority queue Q with an extractMin operation.
If u is a vertex in Q, what’s key[u] (the value that determines u’s position in Q)?
- key[u] = minimum weight of edge from u into VTV_T
- If no such edge exists, key[u] = ∞.
- We maintain information about the parent (in (VT,ETV_T, E_T)) of each vertex v in an array parent[].
- ETE_T is kept implicitly as {(v, parent[v]) | v ∈ V−Q−{r}}.

Assuming a binary heap:


Kruskal's Algorithm

Idea: create a forest F (a set of trees), where each vertex in the graph is a separate tree
create a set S containing all edges in the graph
while S is not empty ad F is not yet spanning:
-remove and edge with minimum weight from S
-if the removed edge connects to different trees, then add it to F, combining two trees into a single tree
At the termination of the algorithm, the forest forms a minimum spanning tree of the graph.

Pseudocode using the union find data structure:

Kruskal(G):
  A = null
  for each v in V:
    MakeSet(v)
  for each (u, v) in E, sorted by weight from smallest to largest:
    if FindSet(u) != FindSet(v):
      A = A union {(u, v)}
      Union(u, v)
  return A

Union Find (disjoint set) data structure
The operations we need are:

Complexity:
Sorting the edges costs O(E log E)
The union find data structure can perform O(V) operations in O(V log V) time.

A sophisticated union find data structure can run in O(α(V))\mathcal{O}(\alpha (V)) time.
Combined with a linear time sorting algorithm, we could achieve O(Eα(V))\mathcal{O}(E \alpha(V)) run time.


Shortest path - Dijkstra’s Algorithm

Dijkstra’s finds the lowest cost path from source node and a destination node in a weighted graph.

Input: A source node s and destination node d, Graph G =
(V, E) with edge weights {W[e] | e in E}
Output: the path weight from s to d

for i <- V.length
  dist[i] <- infinity
  previous[i] <- nil
dist[s] <- 0

H <- minHeap(V) // using the values in dist as keys
while H is not empty
  u <- deleteMin(H)
  for all edges (u, v) in E
    if dist[v] > dist[u] + E[u,v]
      dist[v] <- dist[u] + E[u,v]
      previous[v] <- u
      update H

Other algorithms you should consider studying:

Proababilistic/Random Algorithms:
Random pivot selection for Quicksort
Bloom Filter - space-efficient probabilistic data structure
Skip List

Hashing
Cuckoo hashing
Universal hashing