if left[0] <= right[0]:
    arr[ix] = left.pop(0)
else:
    arr[ix] = right.pop(0)
The Merge Subroutine:
The sorting happens in merge! It is interesting to note that no actual sorting occurs in the main mergesort function; instead, the sorting occurs in the merge subroutine. To see this, consider what happens when we execute mergesort([5, 2]). The function splits this array into left = [5] and right = [2] and makes recursive calls to mergesort([5]) and mergesort([2]). These recursive calls exit without doing anything, since their inputs are arrays of size one. Hence mergesort calls merge([5], [2], [5, 2]). The merge function is given two "stacks of cards", each with one element. It will take the smaller of the two cards and place it into the first slot in arr, then place the other card into the second slot – this is where the elements are sorted! Before calling merge, arr is [5, 2]. After calling merge, arr is [2, 5], which is in sorted order.
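For concreteness, here is a minimal sketch of mergesort and merge along the lines described above (a simplified version for illustration; the exact code from lecture may differ in details):

from_copies = """helper sketch, assuming merge receives copies of the halves"""

def mergesort(arr):
    """Sort the list `arr` in place."""
    if len(arr) > 1:
        middle = len(arr) // 2
        left = arr[:middle]     # copy of the first half
        right = arr[middle:]    # copy of the second half
        mergesort(left)
        mergesort(right)
        merge(left, right, arr)

def merge(left, right, arr):
    """Merge the sorted lists `left` and `right` into `arr`."""
    for ix in range(len(arr)):
        # take from `left` if it holds the smaller front card,
        # or if `right` has run out of cards
        if left and (not right or left[0] <= right[0]):
            arr[ix] = left.pop(0)
        else:
            arr[ix] = right.pop(0)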
Solving Basic Recurrences by Unrolling Them
More precisely, we can solve a recurrence by following the procedure below:
1. Unroll the recurrence until a pattern emerges. We have already done this above. Let’s agree to call the “first step” of unrolling the point in time before we have unrolled the recurrence. Then at the first step we have the original recurrence: T(n) = T(n − 1) + c. The “second step” of unrolling is then the point in time after unrolling once. At the second step we found: T(n) = T(n −2) + 2c. At the third step (after unrolling twice), we had: T(n) = T(n − 3) + 3c. We could continue, but by now a pattern has emerged.
Placing these results into a table can make it easier to identify the pattern:

Step    Formula
1       T(n − 1) + c
2       T(n − 2) + 2c
3       T(n − 3) + 3c
2. Find a general formula for T(n) in the kth step of unrolling. The table above makes it clear that on the kth step of unrolling we will obtain the formula T(n) = T(n − k) + k · c. You can check your general formula by plugging in a particular value of k and verifying that the result matches what you have in the table above. For instance, taking k = 3 yields T(n) = T(n − 3) + 3c, which is indeed what is in the table.
3. Calculate the step number in which the base case is reached. The argument to T decreases with each step until it eventually reaches zero, which is the base case of this recurrence. On what step does it reach zero, exactly? On the kth step, the argument is n − k. We therefore solve n − k = 0 for k, resulting in k = n. Hence the base case is reached on the nth step.

4. Replace k with this step number in the general formula for T(n). We found above that, on the kth step, T(n) = T(n − k) + k · c. Replacing k by n (the number of steps needed to reach the base case), we arrive at a new formula:

T(n) = T(n − n) + n · c = T(0) + n · c

We know from Equation 2.1 that T(0) = c_0. So:

T(n) = c_0 + n · c

At this point, we've removed all T's from the right-hand side of the equation. We have therefore solved the recurrence: its solution is T(n) = c_0 + n · c. You can verify that this formula produces the same result as using Equation 2.1 to recursively calculate T(n), for all integer values of n ≥ 0. Therefore, the time it takes factorial to run on an input of size n is T(n) = c_0 + c · n = Θ(n). That is, it takes linear time.
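A quick numerical check of this closed form (using the arbitrary choice c_0 = c = 1, purely for illustration):

def T(n, c0=1, c=1):
    """Evaluate the recurrence T(0) = c0, T(n) = T(n - 1) + c directly."""
    return c0 if n == 0 else T(n - 1, c0, c) + c

# the closed form T(n) = c0 + n*c matches the recurrence for every n we try
assert all(T(n) == 1 + n * 1 for n in range(100))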
Another Example: Solving the Mergesort Recurrence Relation
You can actually solve the recurrence relation given above. We'll sketch how to do that here. We'll write n instead of O(n) in the first line below because it makes the algebra much simpler.
T(n) = 2 T(n/2) + n
     = 2 [2 T(n/4) + n/2] + n
     = 4 T(n/4) + 2n
     = 4 [2 T(n/8) + n/4] + 2n
     = 8 T(n/8) + 3n
     = 16 T(n/16) + 4n
     ...
     = 2^k T(n/2^k) + k·n      [this is the Eureka! line]
We know that T(1) = 1, and this gives us a way to end the derivation above. In particular, we want T(1) to appear on the right-hand side of the equals sign. This means we want:

n/2^k = 1, or equivalently n = 2^k, or equivalently k = log_2 n
Continuing with the previous derivation, and substituting k = log_2 n, we get:

T(n) = 2^k T(n/2^k) + k·n
     = 2^(log_2 n) T(1) + (log_2 n) · n
     = n + n log_2 n      [remember that T(1) = 1]
     = O(n log n)
So we've solved the recurrence relation, and its solution is what we "knew" it would be. To make this a formal proof you would need to use induction to show that O(n log n) is indeed the solution to the given recurrence relation, but the "plug and chug" method shown above shows how to derive the solution; the subsequent verification is something that can be left to a more advanced algorithms class.
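As a sanity check (not a proof), we can evaluate the recurrence directly and compare it against the closed form for powers of two:

def T(n):
    """Evaluate T(1) = 1, T(n) = 2 T(n/2) + n, for n a power of two."""
    return 1 if n == 1 else 2 * T(n // 2) + n

# closed form derived above: T(n) = n + n log_2 n
for k in range(11):
    n = 2 ** k
    assert T(n) == n + n * k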
Identifying the Recurrence That Represents the Run Time of an Algorithm
Binary Search
Using the assumption that the input is sorted to design a more efficient algorithm (like Homework 03, Problems 5, 6, and 7).
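For reference, here is one standard recursive formulation of binary search (a sketch; the version from lecture or homework may differ). Each call does constant work and recurses on half the range, giving the recurrence T(n) = T(n/2) + c, whose solution is Θ(log n):

def binary_search(arr, target, start=0, stop=None):
    """Return True if `target` occurs in the sorted list arr[start:stop]."""
    if stop is None:
        stop = len(arr)
    if start >= stop:           # empty range: target not present
        return False
    middle = (start + stop) // 2
    if arr[middle] == target:
        return True
    elif arr[middle] < target:  # target can only be in the right half
        return binary_search(arr, target, middle + 1, stop)
    else:                       # target can only be in the left half
        return binary_search(arr, target, start, middle)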
Basic Graph Theory
A graph like Facebook whose edges have no direction is called an undirected graph. A graph like Twitter whose edges have a direction is called a directed graph.
Definition 4. A directed graph (or digraph) G is a pair (V, E), where V is a finite set of objects (called the nodes or vertices) and E is a set of ordered pairs of objects in V (called the edges).
Definition 5. An undirected graph G is a pair (V, E), where V is a finite set of objects (called nodes) and E is a set of unordered pairs of objects in V (called edges).
In a directed graph, the edge (u, v) is said to leave u and enter v. In an undirected graph, the edge (u, v) is said to be incident upon both u and v.
In an undirected graph, the set of neighbors of a node u consists of all of the nodes which share an edge with u.
The in-degree of a node u in a directed graph is the number of edges which enter u, and the out-degree of u is the number of edges which leave u.
The degree (or total degree) of a node in a directed graph is the sum of the node’s in-degree and the node’s out-degree.
Definition 6. Let G = (V, E) be a graph (either directed or undirected), and let u, u′ ∈ V be nodes. A path of length k from u to u′ is a sequence (v_0, v_1, . . . , v_{k−1}, v_k) of k + 1 nodes in V (with k ≥ 0) such that v_0 = u, v_k = u′, and for each i ∈ {1, . . . , k}, (v_{i−1}, v_i) ∈ E.
Note that we have defined the length of a path to be one less than the number of nodes contained within; equivalently, the length is the number of edges along the path. We define a simple path to be a path in which each node along the path is unique.
We say that a node v is reachable from a node u if there is a path from u to v.
Definition 7. Let G = (V, E) be an undirected graph. A connected component of G is a set C ⊂ V such that:
1. any pair of nodes u, u′ ∈ C are reachable from one another; and
2. if u ∈ C and v ∉ C, then u and v are not reachable from one another.
Definition 8. Let G = (V, E) be a directed graph. A strongly connected component of G is a set C ⊂ V such that:
1. any pair of nodes u, u′ ∈ C are mutually reachable from one another; and
2. if u ∈ C and v ∉ C, then u and v are not mutually reachable from one another.

For example, consider the directed graph with edges (a, b), (b, a), and (b, c). There is only one weakly connected component of this graph: {a, b, c}. However, there are two strongly connected components: {a, b} and {c}.
Adjacency Matrix vs. Adjacency List vs. Dict-of-Sets

Suppose the nodes are labeled with the numbers 0, 1, . . . , |V| − 1. Then an adjacency matrix is a |V| × |V| matrix A whose (i, j) entry A_ij is one if and only if edge (i, j) exists, and zero otherwise.
Nevertheless, if the graph is dense – that is, it has close to the greatest number of edges possible – then adjacency matrices are not so wasteful.
For instance, querying whether or not an edge (u, v) exists takes constant time if the graph is represented using an adjacency matrix, since we need only return A[u, v]. With adjacency lists, on the other hand, it can take Θ(V) time in the worst case, since we must search for v in a list of all of u's neighbors, which can potentially contain all of the nodes in the graph.
We can implement a fast graph representation using a dictionary of sets. The approach is very similar to that of the adjacency list representation, except that we replace the outer list with a dict and the inner lists with sets. The principal advantage of using a dict for the outer container is that it allows us to label nodes with something other than the numbers 0 through |V| − 1. Using this representation, querying for the existence of an edge is a Θ(1) operation on average, just as with adjacency matrices. To be precise, to determine whether the edge (u, v) is in the graph, we first look up u in the outer dict, which takes constant time on average. This returns the set of u's neighbors; we can then check whether v is in this set in constant average time.
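A minimal sketch of the dict-of-sets representation for an undirected graph (the class and method names here are our own and may not match the lecture code):

class Graph:
    """An undirected graph stored as a dict mapping nodes to neighbor sets."""

    def __init__(self):
        self.adj = {}

    @property
    def nodes(self):
        return self.adj.keys()

    def add_node(self, u):
        self.adj.setdefault(u, set())

    def add_edge(self, u, v):
        self.add_node(u)
        self.add_node(v)
        self.adj[u].add(v)
        self.adj[v].add(u)

    def has_edge(self, u, v):
        # two average-case Θ(1) hash lookups
        return u in self.adj and v in self.adj[u]

    def neighbors(self, u):
        return self.adj[u]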
The table below shows these time complexities for undirected graphs; those for directed graphs are similar:

Operation                             Time
Adding a node                         Θ(1)
Removing a node with m neighbors      Θ(m)
Adding an edge                        Θ(1)
Removing an edge                      Θ(1)
Querying for a node                   Θ(1)
Querying for an edge                  Θ(1)
Retrieving the number of neighbors    Θ(1)
Iterating over m neighbors            Θ(m)
Counting the Number of Edges in a Graph

The sum of the vertex degrees is twice the number of edges, because each edge is counted from both of its ends: Σ_{v ∈ V} degree(v) = 2|E|.
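In code (assuming the dict-of-sets representation sketched earlier), this observation gives a short way to count edges:

def edge_count(adj):
    """Count edges in an undirected graph given as {node: set of neighbors}."""
    # each edge (u, v) contributes to both len(adj[u]) and len(adj[v])
    return sum(len(neighbors) for neighbors in adj.values()) // 2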
BFS AND DFS
- Breadth-first search (including the computation of distances and predecessors)
from collections import deque

def full_bfs(graph):
    status = {node: 'undiscovered' for node in graph.nodes}
    for node in graph.nodes:
        if status[node] == 'undiscovered':
            bfs(graph, node, status)
def bfs(graph, source, status=None):
    """Start a BFS at `source`."""
    if status is None:
        status = {node: 'undiscovered' for node in graph.nodes}
    status[source] = 'pending'
    pending = deque([source])  # will use as a queue
    # while there are still pending nodes
    while pending:
        u = pending.popleft()  # pop from left (front of queue)
        for v in graph.neighbors(u):
            # explore edge (u, v)
            if status[v] == 'undiscovered':
                status[v] = 'pending'
                pending.append(v)  # append to right (back of queue)
        status[u] = 'visited'
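The version above only tracks node statuses. Since the bullet mentions distances and predecessors, here is one way to add that bookkeeping (a sketch following the standard BFS approach; the function name is our own):

def bfs_distances(graph, source):
    """BFS from `source`; return (dist, pred) dictionaries."""
    dist = {source: 0}
    pred = {source: None}
    pending = deque([source])
    while pending:
        u = pending.popleft()
        for v in graph.neighbors(u):
            if v not in dist:  # first discovery yields the shortest distance
                dist[v] = dist[u] + 1
                pred[v] = u
                pending.append(v)
    return dist, pred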
- Depth-first search (including the computation of start and finish times and predecessors)
def full_dfs(graph):
    status = {node: 'undiscovered' for node in graph.nodes}
    for node in graph.nodes:
        if status[node] == 'undiscovered':
            dfs(graph, node, status)
def dfs(graph, u, status=None):
    """Start a DFS at `u`."""
    # initialize status if it was not passed
    if status is None:
        status = {node: 'undiscovered' for node in graph.nodes}
    status[u] = 'pending'
    for v in graph.neighbors(u):  # explore edge (u, v)
        if status[v] == 'undiscovered':
            dfs(graph, v, status)
    status[u] = 'visited'
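The bullet above also mentions start and finish times. Here is one way to record them (a sketch using a nested helper and a shared clock; the lecture code may structure this differently):

def full_dfs_times(graph):
    """Run DFS over the whole graph, recording start and finish times."""
    status = {node: 'undiscovered' for node in graph.nodes}
    start, finish = {}, {}
    clock = 0

    def visit(u):
        nonlocal clock
        clock += 1
        start[u] = clock       # the clock ticks when u is discovered...
        status[u] = 'pending'
        for v in graph.neighbors(u):
            if status[v] == 'undiscovered':
                visit(v)
        status[u] = 'visited'
        clock += 1
        finish[u] = clock      # ...and again when u is finished

    for node in graph.nodes:
        if status[node] == 'undiscovered':
            visit(node)
    return start, finish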
TOPOLOGICAL SORT
Given a prerequisite graph, we wish to find a topological sort of its nodes in order to create a schedule of classes in which each course is taken after all of its prerequisites have been satisfied. In the language of graph theory, node u should appear before all nodes which are reachable from u in the prerequisite graph. Intuitively, whether or not a node v is reachable from u is related to the finish times of both nodes. The following claim makes this precise:

Claim 7. Suppose v is reachable from u (where u ≠ v) in the directed graph G. Then the finish time of v is less than the finish time of u.
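This claim suggests an algorithm: run a full DFS and order the nodes by decreasing finish time. A sketch, reusing the full_dfs_times helper written above (and assuming the graph is a DAG):

def topological_sort(graph):
    """Return the nodes of a DAG in topological order."""
    start, finish = full_dfs_times(graph)
    # by Claim 7, if v is reachable from u then finish[v] < finish[u],
    # so sorting by decreasing finish time places u before v
    return sorted(graph.nodes, key=lambda node: finish[node], reverse=True)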
- Understanding of how the time complexity of BFS and DFS is determined

Before the search begins, full_bfs initializes the status of each node in Θ(V) time. It then loops over every node in the graph, checking whether it is 'undiscovered' or not; these two lines are executed Θ(V) times, taking Θ(1) time per execution, for a total of Θ(V) time. The remaining work is done by the inner for-loop of bfs, which explores each edge once from every endpoint at which it is discovered. Therefore, the inner loop iterates exactly |E| times in total if the graph is directed, and exactly 2|E| times in total if the graph is undirected. On each iteration, the loop spends constant time checking the status of nodes and appending them to the queue, if need be. As a result, the inner for-loop takes Θ(E) time in total over the course of the entire search. Since initializing the node statuses takes Θ(V) time, and exploring each edge takes Θ(E) time, the time complexity of the algorithm as a whole is Θ(V + E).

If we know that one of |V| or |E| is larger than the other, we can simplify this expression. For instance, if the graph is fully connected (meaning it has every possible edge), then |E| = Θ(|V|^2), and the run time simplifies to Θ(|V|^2). On the other hand, if the graph has no edges, the algorithm must still perform initialization (as well as add each node to the queue and pop each node off), taking Θ(|V|) time. Lastly, since the time complexity of the "full" BFS is an upper bound on the time complexity of bfs, the time complexity of the latter is O(V + E).
The analysis of depth-first search is similar. First, note that the initialization of status in full_dfs will take Θ(V) time, as will iterating over the nodes of the graph and checking whether each is undiscovered. What remains is to determine the time taken in aggregate by all calls to dfs. Consider an arbitrary edge (u, v). If the graph is directed, this edge is explored exactly once: when dfs is called on node u. On the other hand, if the graph is undirected, the edge is explored exactly twice: when dfs is called on node u, and again when it is called on node v. As a result, each edge is explored Θ(1) times, and the total number of iterations of the for-loop in dfs is Θ(E). In sum, the total time taken by full_dfs is therefore Θ(V + E), the same as breadth-first search. Since the time complexity of full_dfs upper-bounds that of dfs, the time complexity of the latter is O(V + E).
- Minimum Spanning Trees (just their definition, basic proofs concerning them)
Definition 12. Let G = (V, E) be a connected, undirected graph (weighted or unweighted). A spanning tree is a tree T = (V, E_T) which contains all of the nodes in G, and whose edge set E_T is a subset of the edges E that appear in the graph G. Intuitively, a minimum spanning tree of a weighted graph is a spanning tree whose total edge weight is minimized.
Definition 13. Let G = (V, E, ω) be a connected, weighted, undirected graph. Let 𝒯 be the set of spanning trees of G. For any spanning tree T ∈ 𝒯, define the total edge weight ω(T) to be:

ω(T) = Σ_{(u,v) ∈ E_T} ω((u, v))

A minimum spanning tree is a spanning tree T* ∈ 𝒯 such that for any spanning tree T ∈ 𝒯, ω(T*) ≤ ω(T).
- Single Linkage Clustering (ability to run the algorithm on a given data set)
- The definitions of the three "hard" problems mentioned in the final lecture.

Eulerian cycles: An Eulerian cycle is a cycle which uses each edge exactly once. A graph has an Eulerian cycle if and only if it is connected and each node has even degree.
We can determine if a graph has an Eulerian cycle in polynomial time.
Hamiltonian cycles: Is there a cycle which visits each node exactly once (except the starting node)? No polynomial algorithm is known which can determine if a general graph has a Hamiltonian cycle.

In short:
Eulerian cycle: polynomial algorithm, "easy".
Hamiltonian cycle: no polynomial algorithm known, "hard".

Example: Long Paths
Input: a graph G, source u, sink v, and integer k. Problem: is there a simple path between u and v of length ≥ k?

Naïve solution: try all |V|! candidate paths. Polynomial solution: unknown.
But a candidate solution can be verified in polynomial time.

Relationship to the Hamiltonian Problem
We can use an algorithm for long paths to solve the Hamiltonian problem. To find a Hamiltonian cycle starting (WLOG) at u:
1. For each neighbor v of u:
1.1 Create the graph G′ by copying G and deleting the edge (u, v).
1.2 If there is a simple path of length ≥ |V| − 1 from u to v in G′, then there is a Hamiltonian cycle.
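A sketch of this meta-algorithm in Python, treating the long-path solver as a black box (the graph methods copy and remove_edge, and the solver has_long_simple_path, are hypothetical names assumed for illustration):

def hamiltonian_cycle_exists(graph, u, has_long_simple_path):
    """Decide Hamiltonian-cycle existence using a long-path oracle."""
    n = len(graph.nodes)
    for v in list(graph.neighbors(u)):
        G2 = graph.copy()        # hypothetical copy method
        G2.remove_edge(u, v)     # hypothetical edge-removal method
        # a simple u-v path of length >= n - 1 visits every node, and
        # adding back the edge (u, v) closes it into a Hamiltonian cycle
        if has_long_simple_path(G2, u, v, n - 1):
            return True
    return False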
If the algorithm for finding long simple paths takes polynomial time, then so does this meta-algorithm.

Example: 3-SAT
Suppose x_1, . . . , x_n are boolean variables (True or False). A 3-clause is a combination made by or-ing, and possibly negating, three variables. For example:

x_1 or x_5 or (not x_7)
(not x_1) or (not x_2) or (not x_4)
Given: m clauses over n boolean variables. Problem: Is there an assignment of x_1, . . . , x_n which makes all clauses true simultaneously?
No polynomial time algorithm is known.
But it is easy to verify a proposed solution, given a hint (an assignment of the variables).
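A sketch of such a verifier (the clause encoding here, pairs of (variable index, negated?), is our own choice for illustration):

def clause_satisfied(clause, assignment):
    """A clause is a list of (variable index, negated?) pairs."""
    return any(assignment[i] != negated for i, negated in clause)

def verify(clauses, assignment):
    """Check all m clauses in O(m) time: verification is easy."""
    return all(clause_satisfied(c, assignment) for c in clauses)

# example: the single clause x_1 or x_5 or (not x_7)
clauses = [[(1, False), (5, False), (7, True)]]
assert verify(clauses, {1: False, 5: True, 7: True})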
Commonalities

All of the problems above (both easy and hard) were decision problems: framed as having a yes/no answer. The hard problems (Hamiltonian, long path, 3-SAT) were nevertheless easy to verify.
We define two sets of decision problems:
P: problems that can be solved in polynomial time.
NP: problems that can be verified in polynomial time.
Eulerian cycle and short path are in P.
But they’re also in NP! P ⊂ NP.
Hamiltonian, long path, and 3-SAT are in NP.

Cook's Theorem: Suppose there is a polynomial algorithm for 3-SAT. Then there is a polynomial time algorithm for any problem in NP.

NP-Completeness
We say that a problem is NP-complete if:
1. it is in NP; and
2. a polynomial algorithm for it could be used to solve every other problem in NP in polynomial time.
Hamiltonian, long path, and 3-SAT are all NP-complete.
NP-complete problems are the “hardest” in NP.