Algorithms - CS3401 - Notes - Unit 2 - Graph Algorithms
GRAPH ALGORITHMS
Basic concepts
Definition
A graph G = (V, E) is a non-linear data structure that consists of
a set of vertices (nodes) V and a set of edges E, where each edge
connects a pair of vertices.
There are 2 types of graphs:
Directed
Undirected
Directed graph
A graph with only directed edges is said to be a directed graph.
Example
The following directed graph has 5 vertices and 8 edges. This graph
G can be defined as G = (V, E), where V = {A,B,C,D,E} and
E = {(A,B), (A,C), (B,E), (B,D), (D,A), (D,E), (C,D), (D,D)}.
Directed Graph
Undirected graph
A graph with only undirected edges is said to be an undirected
graph.
Example
The following is an undirected graph.
Undirected Graph
REPRESENTATION OF GRAPHS
A graph is commonly represented in one of the following two
ways.
1. Adjacency Matrix
2. Adjacency List
1. Adjacency Matrix
In this representation, the graph can be represented using a matrix
of size n x n, where n is the number of vertices.
This matrix is filled with either 1’s or 0’s.
Here, 1 represents that there is an edge from row vertex to column
vertex, and 0 represents that there is no edge from row vertex to
column vertex.
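As a minimal sketch, the adjacency matrix of the directed example graph given earlier (V = {A,B,C,D,E} with its 8 edges) could be built as follows in Python. The function name build_adjacency_matrix and the vertex-to-index mapping are assumptions made for illustration.

```python
# Vertices and directed edges of the example graph from the text.
vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "E"), ("B", "D"),
         ("D", "A"), ("D", "E"), ("C", "D"), ("D", "D")]

def build_adjacency_matrix(vertices, edges):
    """Return an n x n matrix with 1 where an edge row -> column exists."""
    index = {v: i for i, v in enumerate(vertices)}  # vertex name -> row/column
    n = len(vertices)
    matrix = [[0] * n for _ in range(n)]
    for u, v in edges:
        matrix[index[u]][index[v]] = 1  # directed: set only row u, column v
    return matrix

matrix = build_adjacency_matrix(vertices, edges)
for name, row in zip(vertices, matrix):
    print(name, row)
# Row A is [0, 1, 1, 0, 0]; row D is [1, 0, 0, 1, 1] (the self-loop (D,D) sets the diagonal).
```

For an undirected graph the same sketch would also set the mirrored cell, which makes the matrix symmetric.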
2. Adjacency List
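An adjacency list stores, for every vertex, the list of vertices it has an edge to. The following is a small sketch for the directed example graph used earlier; the function name build_adjacency_list and the use of a Python dictionary are assumptions made for illustration.

```python
def build_adjacency_list(vertices, edges, directed=True):
    """Map each vertex to the list of vertices reachable by one edge."""
    adj = {v: [] for v in vertices}  # every vertex appears, even with no edges
    for u, v in edges:
        adj[u].append(v)
        if not directed and u != v:
            adj[v].append(u)  # undirected graphs store the edge both ways
    return adj

vertices = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("A", "C"), ("B", "E"), ("B", "D"),
         ("D", "A"), ("D", "E"), ("C", "D"), ("D", "D")]

adj = build_adjacency_list(vertices, edges)
for v in vertices:
    print(v, "->", adj[v])
# D -> ['A', 'E', 'D'] and E -> [] for this graph.
```

Compared with an n x n matrix, this representation uses space proportional to the number of vertices plus edges, which helps when the graph is sparse.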
GRAPH TRAVERSALS
Graph traversal is a technique for visiting the vertices of a graph
in a systematic order, for example to search for a particular
vertex. It also decides the order in which vertices are visited
during the search.
A graph traversal chooses the edges to follow without creating
loops. By marking vertices as visited, we can visit every vertex
reachable from the starting vertex without getting into a looping
path.
There are two graph traversal techniques:
1. DFS (Depth First Search)
2. BFS (Breadth-First Search)
Applications of graphs
1. Social network graphs: To tweet or not to tweet. Graphs that
represent who knows whom, who communicates with whom, who
influences whom, or other relationships in social structures. An
example is the Twitter graph of who follows whom.
2. Graphs in epidemiology: Vertices represent individuals and
directed edges represent the transfer of an infectious disease from
one individual to another. Analyzing such graphs has become an
important component in understanding and controlling the spread
of diseases.
3. Protein-protein interactions graphs: Vertices represent proteins
and edges represent interactions between them that carry out some
biological function in the cell. These graphs can be used, for
example, to study molecular pathways, which are chains of
molecular interactions in a cellular process.
4. Network packet traffic graphs: Vertices are IP (Internet
protocol) addresses and edges are the packets that flow between
them. Such graphs are used for analyzing network security,
studying the spread of worms, and tracking criminal or non-
criminal activity.
5. Neural networks: Vertices represent neurons and edges are the
synapses between them. Neural networks are used to understand how
our brain works and how connections change when we learn. The
human brain has about 10^11 neurons and close to 10^15 synapses.
5. We choose B, mark it as visited and put it onto the stack. Here B
does not have any unvisited adjacent node. So, we pop B from the
stack.
6. We check the stack top to return to the previous node and check
whether it has any unvisited adjacent nodes. Here, we find D on
top of the stack.
7. The only unvisited adjacent node from D is now C. So we visit C,
mark it as visited and put it onto the stack.
Pseudocode (DFS)
DFS(G, u):
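The body of the DFS pseudocode is not reproduced in these notes. As a minimal sketch, the stack-based procedure described in the steps above might look like this in Python. The function name dfs, the adjacency-list dictionary and the five-vertex example graph are assumptions made for illustration.

```python
def dfs(graph, start):
    """Return vertices in the order an iterative DFS first visits them."""
    visited = {start}
    order = [start]
    stack = [start]
    while stack:
        u = stack[-1]  # inspect the top of the stack without popping
        # first neighbor of u that has not been visited yet, if any
        nxt = next((v for v in graph[u] if v not in visited), None)
        if nxt is None:
            stack.pop()  # no unvisited adjacent node: backtrack
        else:
            visited.add(nxt)
            order.append(nxt)
            stack.append(nxt)
    return order

# Hypothetical undirected graph, stored as adjacency lists.
graph = {
    "S": ["A", "B", "C"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "C": ["S", "D"],
    "D": ["A", "B", "C"],
}
order = dfs(graph, "S")
print(order)  # ['S', 'A', 'D', 'B', 'C']
```

This sketch rescans the neighbor list each time a vertex returns to the top of the stack, which keeps the code close to the step-by-step description at the cost of some repeated work.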
4. Next, the unvisited adjacent node from S is B. We mark it as
visited and enqueue it.
5. Next, the unvisited adjacent node from S is C. We mark it as
visited and enqueue it.
6. Now, S is left with no unvisited adjacent nodes. So, we dequeue
and find A.
7. From A we have D as an unvisited adjacent node. We mark it as
visited and enqueue it.
BFS pseudocode
BFS(G, start):
    create an empty queue Q
    mark start as visited
    enqueue start into Q
    while Q is not empty:
        u = dequeue from Q // Take the next vertex to process
        for each neighbor v of u in G:
            if v is not visited:
                mark v as visited
                enqueue v into Q // Enqueue the unvisited neighbor v into the queue
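The BFS pseudocode can be sketched in Python as follows. The function name bfs, the adjacency-list dictionary and the example graph are assumptions made for illustration.

```python
from collections import deque

def bfs(graph, start):
    """Return vertices in the order BFS dequeues them."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        u = queue.popleft()  # dequeue the oldest vertex
        order.append(u)
        for v in graph[u]:
            if v not in visited:
                visited.add(v)   # mark when enqueued, so no vertex is queued twice
                queue.append(v)
    return order

# Hypothetical undirected graph, stored as adjacency lists.
graph = {
    "S": ["A", "B", "C"],
    "A": ["S", "D"],
    "B": ["S", "D"],
    "C": ["S", "D"],
    "D": ["A", "B", "C"],
}
order = bfs(graph, "S")
print(order)  # ['S', 'A', 'B', 'C', 'D']
```

Each vertex is enqueued once and each adjacency list is scanned once, which is where the O(V + E) running time comes from.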
BFS Algorithm Complexity
Time complexity - O(V + E), where V is the number of nodes and E is the number of edges.
Space complexity - O(V).
BFS Algorithm Applications
1. Building a search engine index (web crawling)
2. For GPS navigation
3. Path finding algorithms
4. In Ford-Fulkerson algorithm to find maximum flow in a network
5. Cycle detection in an undirected graph
6. In minimum spanning tree
BICONNECTIVITY
A graph G is a bi-connected graph if it is connected and contains no articulation
points (cut vertices).
To check this, we use the DFS traversal. Using DFS, we try to find whether any
articulation point is present. We also check whether all vertices are visited by
the DFS; if not, the graph is not connected.
Pseudocode for biconnectivity
isArticulation(start, visited, disc, low, parent)
Begin
   time := 0 // static counter: initialized once, not reset on later calls
   dfsChild := 0
   mark start as visited
   set disc[start] := time + 1 and low[start] := time + 1
   time := time + 1
   for all vertex v in the graph G, do
      if there is an edge between (start, v), then
         if v is not visited, then
            increase dfsChild
            parent[v] := start
            if isArticulation(v, visited, disc, low, parent) is true, then
               return true
            low[start] := minimum of low[start] and low[v]
            if parent[start] is φ AND dfsChild > 1, then
               return true
            if parent[start] is not φ AND low[v] >= disc[start], then
               return true
         else if v is not the parent of start, then
            low[start] := minimum of low[start] and disc[v]
   done
   return false
End
Biconnected(graph)
Begin
   initially set all vertices as unvisited and the parent of each vertex as φ
   if isArticulation(0, visited, disc, low, parent) = true, then
      return false
   for each node i of the graph, do
      if i is not visited, then
         return false
   done
   return true
End
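The biconnectivity check can be sketched in Python as below. This is an illustrative version, not a reference implementation: the function names is_biconnected and has_articulation and the dictionary-based graph are assumptions, and a vertex counts as visited once it has a discovery time. It treats the DFS root as an articulation point when it has more than one DFS child, and any other vertex u as one when some child v has low[v] >= disc[u].

```python
def is_biconnected(graph):
    """graph: dict mapping each vertex to a list of neighbors (undirected)."""
    if not graph:
        return True
    disc, low = {}, {}  # discovery time and lowest reachable discovery time
    timer = [0]         # shared counter, kept across recursive calls

    def has_articulation(u, parent):
        timer[0] += 1
        disc[u] = low[u] = timer[0]
        children = 0
        for v in graph[u]:
            if v not in disc:  # tree edge to an unvisited vertex
                children += 1
                if has_articulation(v, u):
                    return True
                low[u] = min(low[u], low[v])
                if parent is None and children > 1:
                    return True  # root with two or more DFS children
                if parent is not None and low[v] >= disc[u]:
                    return True  # subtree of v cannot bypass u
            elif v != parent:  # back edge
                low[u] = min(low[u], disc[v])
        return False

    start = next(iter(graph))
    if has_articulation(start, None):
        return False
    return len(disc) == len(graph)  # every vertex reached means connected

cycle = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
path = {0: [1], 1: [0, 2], 2: [1]}
print(is_biconnected(cycle))  # True
print(is_biconnected(path))   # False (vertex 1 is an articulation point)
```

The recursion depth grows with the length of the DFS path, so very large graphs would need an iterative version.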
PRIM’S ALGORITHM
Prim's algorithm is a minimum spanning tree algorithm that takes a graph as input and
finds the subset of the edges of that graph which forms a tree that includes every vertex
and has the minimum sum of weights among all the trees that can be formed from the graph.
How Prim's algorithm works
It falls under a class of algorithms called greedy algorithms that find the local optimum
in the hopes of finding a global optimum.
We start from one vertex and keep adding edges with the lowest weight until we reach our
goal. The steps for implementing Prim's algorithm are as follows:
1. Initialize the minimum spanning tree with a vertex chosen at random.
2. Find all the edges that connect the tree to new vertices, find the minimum and add it
to the tree
3. Keep repeating step 2 until we get a minimum spanning tree
Example of Prim's algorithm
Choose a vertex
Choose the nearest edge not yet in the solution; if there are multiple choices, choose one at random
T = ∅ // T holds the edges of the MST, initially empty
U = { 1 } // Start with an arbitrary node (node 1) in the set U (which represents the nodes in the MST)
while (U ≠ V): // Continue until all nodes are included in the MST
    let (u, v) be the lowest cost edge such that u ∈ U and v ∈ V - U // Find the lowest cost edge
    T = T ∪ {(u, v)} // Add the edge to the MST
    U = U ∪ {v} // Add node v to the set U (the node is now part of the MST)
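A minimal Python sketch of Prim's algorithm follows. It uses a binary heap (heapq) to pick the lowest cost edge leaving the tree, which is one common implementation choice rather than the only one. The function name prim_mst, the adjacency-list format and the four-vertex example graph are assumptions made for illustration.

```python
import heapq

def prim_mst(graph, start):
    """graph: dict vertex -> list of (weight, neighbor) for an undirected graph.
    Returns (total_weight, tree_edges) for the component containing start."""
    in_tree = {start}  # the set U of the pseudocode
    tree_edges = []    # the set T of the pseudocode
    total = 0
    heap = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(heap)
    while heap and len(in_tree) < len(graph):
        w, u, v = heapq.heappop(heap)  # cheapest candidate edge
        if v in in_tree:
            continue  # both endpoints already in the tree: skip
        in_tree.add(v)
        tree_edges.append((u, v, w))
        total += w
        for w2, x in graph[v]:
            if x not in in_tree:
                heapq.heappush(heap, (w2, v, x))
    return total, tree_edges

# Hypothetical graph with edges 1-2 (1), 1-3 (4), 2-3 (2), 2-4 (5), 3-4 (3).
graph = {
    1: [(1, 2), (4, 3)],
    2: [(1, 1), (2, 3), (5, 4)],
    3: [(4, 1), (2, 2), (3, 4)],
    4: [(5, 2), (3, 3)],
}
total, tree_edges = prim_mst(graph, 1)
print(total, tree_edges)  # 6 [(1, 2, 1), (2, 3, 2), (3, 4, 3)]
```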
KRUSKAL ALGORITHM
Kruskal's algorithm is a minimum spanning tree algorithm that takes a graph as input
and finds the subset of the edges of that graph which forms a tree that includes every
vertex and has the minimum sum of weights among all the trees that can be formed from
the graph.
How Kruskal's algorithm works
It falls under a class of algorithms called greedy algorithms that find the local optimum in the
hopes of finding a global optimum.
We start from the edges with the lowest weight and keep adding edges until we reach our goal.
The steps for implementing Kruskal's algorithm are as follows:
1. Sort all the edges from low weight to high
2. Take the edge with the lowest weight and add it to the spanning tree. If adding the
edge would create a cycle, reject this edge.
3. Keep adding edges until all vertices are connected, that is, until the tree has V - 1 edges.
Choose the edge with the least weight; if there is more than one, choose any one
Choose the next shortest edge that doesn't create a cycle and add it
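// Sort edges by weight, then scan them from lightest to heaviest
For each edge (u, v) ∈ G.E in increasing order of weight(u, v):
    if adding (u, v) does not create a cycle:
        add (u, v) to the spanning tree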
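The steps of Kruskal's algorithm can be sketched in Python as below. The cycle test uses a union-find (disjoint set) structure, which is a standard way to do it; the notes themselves do not specify one. The function name kruskal_mst, the (weight, u, v) edge format and the example graph are assumptions made for illustration.

```python
def kruskal_mst(num_vertices, edges):
    """edges: list of (weight, u, v) with vertices numbered 0..num_vertices-1.
    Returns the list of chosen edges as (u, v, weight)."""
    parent = list(range(num_vertices))  # each vertex starts as its own set

    def find(x):
        # Follow parent links to the set representative, halving the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):  # lowest weight first
        root_u, root_v = find(u), find(v)
        if root_u != root_v:       # different sets: no cycle is created
            parent[root_u] = root_v
            tree.append((u, v, w))
            if len(tree) == num_vertices - 1:
                break              # spanning tree is complete
    return tree

# Hypothetical graph with edges 0-1 (1), 0-2 (4), 1-2 (2), 1-3 (5), 2-3 (3).
edges = [(1, 0, 1), (4, 0, 2), (2, 1, 2), (5, 1, 3), (3, 2, 3)]
tree = kruskal_mst(4, edges)
print(tree)  # [(0, 1, 1), (1, 2, 2), (2, 3, 3)]
```

Sorting dominates the running time, so this sketch takes O(E log E) time on a graph with E edges.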
DIJKSTRA ALGORITHM
Dijkstra's algorithm allows us to find the shortest path between any two vertices of a
graph. It differs from the minimum spanning tree because the shortest distance
between two vertices might not include all the vertices of the graph.
How Dijkstra's Algorithm works
Dijkstra's Algorithm works on the basis that any subpath B -> D of the shortest path A
-> D between vertices A and D is also the shortest path between vertices B and D.
Choose a starting vertex and assign infinity path values to all other vertices
After each iteration, we pick the unvisited vertex with the least path length
Notice how the rightmost vertex has its path length updated twice
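A minimal Python sketch of Dijkstra's algorithm is given below. It uses a binary heap to pick the unvisited vertex with the least path length and assumes all edge weights are non-negative. The function name dijkstra, the adjacency-list format and the example graph are assumptions made for illustration.

```python
import heapq

def dijkstra(graph, source):
    """graph: dict vertex -> list of (neighbor, weight), with weights >= 0.
    Returns a dict of shortest distances from source to every vertex."""
    dist = {v: float("inf") for v in graph}  # all vertices start at infinity
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)  # vertex with the least tentative distance
        if d > dist[u]:
            continue  # stale entry: a shorter path to u was already found
        for v, w in graph[u]:
            new_dist = d + w
            if new_dist < dist[v]:  # relax edge (u, v)
                dist[v] = new_dist
                heapq.heappush(heap, (new_dist, v))
    return dist

# Hypothetical directed graph.
graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 5)],
    "D": [],
}
dist = dijkstra(graph, "A")
print(dist)  # {'A': 0, 'B': 3, 'C': 1, 'D': 4}
```

In this example the distance to B is first set to 4 via the direct edge and later lowered to 3 via C, the same kind of repeated update mentioned above.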
FLOYD-WARSHALL ALGORITHM
Initial graph
Follow the steps below to find the shortest path between all the pairs of vertices.
1. Create a matrix A0 of dimension n*n, where n is the number of vertices. The rows and
columns are indexed by i and j respectively, where i and j are vertices of the graph. Each
cell A[i][j] is filled with the distance from the ith vertex to the jth vertex. If there is
no edge from the ith vertex to the jth vertex, the cell is left as infinity.
Fill each cell with the distance between ith and jth vertex
2. Now, create a matrix A1 using matrix A0. The elements in the first column and the first row
are left as they are. The remaining cells are filled in the following way.
Let k be the intermediate vertex in the shortest path from source to destination. In this step, k is
the first vertex. A[i][j] is filled with
(A[i][k] + A[k][j]) if (A[i][j] > A[i][k] + A[k][j]).
That is, if the direct distance from the source to the destination is greater than the path
through vertex k, the cell is filled with A[i][k] + A[k][j]. In this step, k is vertex 1.
Calculate the distance from the source vertex to destination vertex through this
vertex k
For example: For A1[2, 4], the direct distance from vertex 2 to 4 is 4 and the sum of the
distances from vertex 2 to 4 through vertex 1 (i.e. from vertex 2 to 1 and from vertex 1 to 4) is 7.
Since 4 < 7, A1[2, 4] is filled with 4.
3. Similarly, A2 is created using A1. The elements in the second column and the second row are
left as they are.
In this step, k is the second vertex (i.e. vertex 2). The remaining steps are the same as in step 2.
Calculate the distance from the source vertex to destination vertex through this vertex 2
4. Similarly, A3 and A4 are also created.
Calculate the distance from the source vertex to destination vertex through this vertex 3
Calculate the distance from the source vertex to destination vertex through this vertex 4
5. A4 gives the shortest path between each pair of vertices.
Floyd-Warshall Algorithm Pseudocode:
function FloydWarshall(n, A):
    // Step 1: Initialize the distance matrix A (this is A0)
    // A[i, j] represents the shortest distance from vertex i to vertex j
    // Step 2: Update the distance matrix A using dynamic programming
    // A(k) denotes the matrix after vertex k has been allowed as an intermediate vertex
    for k = 1 to n:
        for i = 1 to n:
            for j = 1 to n:
                A(k)[i, j] = min(A(k-1)[i, j], A(k-1)[i, k] + A(k-1)[k, j])
    return A
Time Complexity
There are three nested loops, each running n times, and the work inside the innermost loop
takes constant time. So, the time complexity of the Floyd-Warshall algorithm is O(n^3).
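The Floyd-Warshall procedure can be sketched in Python as follows. Instead of keeping separate matrices A0, A1, ..., this sketch updates a single copy in place, which gives the same result. The function name floyd_warshall and the 4-vertex example matrix are assumptions made for illustration and are not the graph from the figures.

```python
INF = float("inf")

def floyd_warshall(dist):
    """dist: n x n matrix of direct edge weights (INF where there is no edge).
    Returns a new matrix of shortest distances between all pairs."""
    n = len(dist)
    a = [row[:] for row in dist]  # copy so the input matrix is not modified
    for k in range(n):            # allow vertex k as an intermediate vertex
        for i in range(n):
            for j in range(n):
                if a[i][k] + a[k][j] < a[i][j]:
                    a[i][j] = a[i][k] + a[k][j]
    return a

# Hypothetical directed graph on 4 vertices (0-indexed here).
A0 = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
result = floyd_warshall(A0)
for row in result:
    print(row)
# [0, 3, 5, 6]
# [5, 0, 2, 3]
# [3, 6, 0, 1]
# [2, 5, 7, 0]
```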
NETWORK FLOW
A flow network is a directed graph that is used for modeling material flow. There are two
distinguished vertices: one is the source, which produces material at some steady rate, and
the other is the sink, which consumes the material at the same rate. The flow of the material
at any point in the system is the rate at which the material moves.
Some real-life problems like the flow of liquids through pipes, the current through wires and
delivery of goods can be modelled using flow networks.
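Definition: A Flow Network is a directed graph G = (V, E) such that
1. For each edge (u, v) ∈ E, we associate a nonnegative capacity c(u, v) ≥ 0. If
(u, v) ∉ E, we assume that c(u, v) = 0.
2. There are two distinguished vertices, the source s and the sink t.
Let G = (V, E) be a flow network. Let s be the source of the network, and let t be the sink. A flow
in G is a real-valued function f: V x V → R such that the following properties hold:
1. Capacity constraint: for all u, v ∈ V, f(u, v) ≤ c(u, v).
2. Skew symmetry: for all u, v ∈ V, f(u, v) = -f(v, u).
3. Flow conservation: for all u ∈ V - {s, t}, the sum of f(u, v) over all v ∈ V is 0.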
Ford-Fulkerson Algorithm
Initially, the value of the flow is 0. Find some augmenting path p and increase the flow f on each
edge of p by the residual capacity cf (p). When no augmenting path exists, the flow f is a maximum flow.
FORD-FULKERSON METHOD (G, s, t)
1. Initialize flow f to 0
2. while there exists an augmenting path p
3. do augment flow f along p
4. Return f
FORD-FULKERSON (G, s, t)
1. for each edge (u, v) ∈ E[G]
2. do f [u, v] ← 0
3. f [v, u] ← 0
4. while there exists a path p from s to t in the residual network Gf
5. do cf (p) ← min {cf (u, v) : (u, v) is on p}
6. for each edge (u, v) in p
7. do f [u, v] ← f [u, v] + cf (p)
8. f [v, u] ← -f [u, v]
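A minimal Python sketch of the Ford-Fulkerson method follows. The method itself does not say how to find an augmenting path; this sketch uses breadth-first search, which is the Edmonds-Karp variant. Rather than storing f with skew symmetry, it keeps the residual capacities directly, which is an equivalent bookkeeping choice. The function name max_flow, the capacity-matrix format and the six-vertex example network are assumptions made for illustration.

```python
from collections import deque

def max_flow(capacity, s, t):
    """capacity: n x n matrix with capacity[u][v] = c(u, v), 0 if no edge.
    Returns the value of a maximum flow from s to t."""
    n = len(capacity)
    residual = [row[:] for row in capacity]  # residual capacities cf(u, v)
    flow = 0
    while True:
        # Breadth-first search for an augmenting path in the residual network.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            return flow  # no augmenting path left: the flow is maximum
        # Residual capacity cf(p) of the path: the smallest cf(u, v) on it.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Augment along the path and update the reverse edges.
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        flow += bottleneck

# Hypothetical network: vertex 0 is the source s, vertex 5 is the sink t.
capacity = [
    [0, 16, 13, 0, 0, 0],
    [0, 0, 0, 12, 0, 0],
    [0, 4, 0, 0, 14, 0],
    [0, 0, 9, 0, 0, 20],
    [0, 0, 0, 7, 0, 4],
    [0, 0, 0, 0, 0, 0],
]
value = max_flow(capacity, 0, 5)
print(value)  # 23
```

The result can be checked against the cut separating {0, 1, 2, 4} from {3, 5}, whose capacity is 12 + 7 + 4 = 23.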
Example: Each Directed Edge is labeled with capacity. Use the Ford-Fulkerson algorithm to find the
maximum flow.
Solution: The left side of each part shows the residual network Gf with a shaded augmenting
path p, and the right side of each part shows the net flow f.
MAXIMUM BIPARTITE MATCHING
A matching in a bipartite graph is a set of edges chosen in such a way that no two edges
in the set share an endpoint. A maximum matching is a matching with the maximum number
of edges.
When a maximum matching is found, we cannot add another edge. If one more edge is added to
a maximum matching, the result is no longer a matching. A bipartite graph can have more
than one maximum matching.
Algorithm
bipartiteMatch(u, visited, assign)
Input − Starting node u, a visited list to keep track of visited nodes, and an assign list that records which node each node is matched with.
Output − Returns true when a matching for vertex u is possible.
Begin
   for all vertex v, which are adjacent with u, do
      if v is not visited, then
         mark v as visited
         if v is not assigned, or bipartiteMatch(assign[v], visited, assign) is true, then
            assign[v] := u
            return true
   done
   return false
End
maxMatch(graph)
Input − The given graph.
Output − The maximum number of matches.
Begin
   initially no vertex is assigned
   count := 0
   for all applicant u in M, do
      make all nodes unvisited
      if bipartiteMatch(u, visited, assign), then
         increase count by 1
   done
   return count
End
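The matching procedure above can be sketched in Python as below. The function names max_bipartite_matching and try_match, the adjacency-list input and the small example are assumptions made for illustration. Left-side vertices play the role of the applicants in M, and assign[v] records the left vertex currently matched to right vertex v.

```python
def max_bipartite_matching(adj, num_right):
    """adj[u]: list of right-side vertices adjacent to left vertex u.
    Returns (size of a maximum matching, assign list)."""
    assign = [-1] * num_right  # -1 means the right vertex is unassigned

    def try_match(u, visited):
        for v in adj[u]:
            if not visited[v]:
                visited[v] = True
                # Take v if it is free, or if its current partner can move elsewhere.
                if assign[v] == -1 or try_match(assign[v], visited):
                    assign[v] = u
                    return True
        return False

    count = 0
    for u in range(len(adj)):
        # Fresh visited list for every left vertex, as in maxMatch above.
        if try_match(u, [False] * num_right):
            count += 1
    return count, assign

# Hypothetical graph: left vertices 0..2, right vertices 0..2.
adj = [[0, 1], [0], [1, 2]]
count, assign = max_bipartite_matching(adj, 3)
print(count, assign)  # 3 [1, 0, 2]
```

In the example, left vertex 1 can only use right vertex 0, so left vertex 0 is moved from right vertex 0 to right vertex 1, which is the reassignment step performed by the recursive call.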