PG-DAC SEPT-2021
ALGORITHMS & DATA STRUCTURES
SACHIN G. PAWAR
SUNBEAM INSTITUTE OF INFORMATION & TECHNOLOGIES, PUNE & KARAD
Sunbeam Infotech - www.sunbeaminfo.com
2. Non-Linear / Advanced data structures : data elements get stored / arranged in memory
in a non-linear manner (e.g. a hierarchical manner) and hence can be accessed non-linearly.
- Tree (hierarchical manner)
- Binary Heap
- Graph
- Hash Table (associative manner)
+ Union: A union is similar to a structure, except for memory allocation: the size of a
union is the size of its largest member, and that memory is shared among all its
members for effective memory utilization (useful only in special cases).
Q. What is a Program?
- A Program is a finite set of instructions written in a programming
language (either a high level programming language like C, C++, Java, Python,
or a low level language like assembly or machine code) given to the
machine to do a specific task.
Q. What is an Algorithm?
- An algorithm is a finite set of instructions written in any human
understandable language (like English) which, if followed, accomplishes a given task.
- Pseudocode : a special form of an algorithm - a finite set of
instructions written in a human understandable language (like English) using
some programming constructs - which, if followed, accomplishes a given task.
- An algorithm is a template whereas a program is an implementation of
an algorithm.
- There are two types of Algorithms OR there are two approaches to write an algorithm:
1. Iterative (non-recursive) Approach :
Algorithm ArraySum( A, n ){   // where A is an array of size n
    sum = 0;
    for( index = 1 ; index <= n ; index++ ){
        sum += A[ index ];
    }
    return sum;
}
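The same iterative logic as a minimal Java sketch (a hedged illustration, not from the notes; Java arrays are 0-based, unlike the 1-based pseudocode above, and the class/method names are illustrative):

class ArraySumDemo {
    static int arraySum( int[] a ){
        int sum = 0;
        for( int index = 0 ; index < a.length ; index++ ){   // 0-based indexing in Java
            sum += a[ index ];                               // add each element to the running total
        }
        return sum;
    }
    public static void main( String[] args ){
        System.out.println( arraySum( new int[]{ 10, 20, 30, 40, 50 } ) );   // prints 150
    }
}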
e.g. iteration:
for( exp1 ; exp2 ; exp3 ){
    statement/s
}
exp1 => initialization
exp2 => termination condition
exp3 => modification
2. Recursive Approach:
While writing a recursive algorithm we need to take care of 3 things:
1. Initialization: at the time of the first call to the recursive function
2. Base condition / termination condition: at the beginning of the recursive function
3. Modification: while making the recursive function call
Example:
void fun( int n ){
    if( n == 0 )            // base/termination condition
        return;
    printf("%4d", n);
    fun( n - 1 );           // recursive call with modification
}
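The space-complexity analysis below refers to a recursive array-sum algorithm whose pseudocode is not shown in these merged notes; a minimal Java sketch of that idea, with illustrative names, could look like this:

class RecursiveSum {
    // base condition when n == 0; modification: n-1 in every recursive call
    static int recSum( int[] a, int n ){          // n = number of elements still to be added
        if( n == 0 )                              // base/termination condition
            return 0;
        return recSum( a, n - 1 ) + a[ n - 1 ];   // recursive call with modified n
    }
    // initialization (first call): recSum( a, a.length )
}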
Space Complexity = Code Space + Data Space + Stack Space (stack space applies only to recursive
algorithms)
Code Space = space required for the instructions
Data Space = space required for variables, constants & instance characteristics
Stack Space = space required for FARs
- When any function gets called, one entry gets created on the stack
for that function call, referred to as a function activation record (FAR) / stack
frame; it contains formal params, local vars, return addr, old
frame pointer etc...
In our example of a recursive algorithm:
3 units (for A, index & n) + 2 units (for constants 0 & 1) = total 5 units
of memory required per function call.
- for an array of size n, the algo gets called (n+1) times.
Hence, total space required = 5 * (n+1)
S = 5n + 5
=> S >= 5n
=> S ~= 5n => O(n), where n = size of the array
# Time Complexity:
Time Complexity = Compilation Time + Execution Time
Time complexity has two components :
1. Fixed component : compilation time
2. Variable component : execution time => it depends on instance
characteristics of an algorithm.
Example :
Algorithm ArraySum( A, n ){   // where A is an array of size n
    sum = 0;
    for( index = 1 ; index <= n ; index++ ){
        sum += A[ index ];
    }
    return sum;
}
- for array size = 5 => the instruction/s inside the for loop execute 5 times
- for array size = 10 => the instruction/s inside the for loop execute 10 times
- for array size = 20 => the instruction/s inside the for loop execute 20 times
- for array size = n => the instruction/s inside the for loop execute "n" times
# Scenario-1 :
Machine-1 : Pentium-4 : Algorithm : input size = 10
Machine-2 : Core i5 : Algorithm : input size = 10
# Scenario-2 :
Machine-1 : Core i5 : Algorithm : input size = 10 : system fully loaded with other processes
Machine-2 : Core i5 : Algorithm : input size = 10 : system not fully loaded with other
processes.
- It is observed that execution time depends not only on instance characteristics but also
on external factors like the hardware on which the algorithm runs and other conditions;
hence it is not a good practice to decide the efficiency of an algo (i.e. to calculate its
time complexity) on the basis of execution time and compilation time. Therefore, asymptotic
analysis is preferred for the analysis of algorithms.
# Asymptotic Notations:
1. Big Omega (Ω) : this notation is used to denote best case time
complexity - also called the asymptotic lower bound; the running time of
an algorithm cannot be less than its asymptotic lower bound.
2. Big O (O) : this notation is used to denote worst case time
complexity - also called the asymptotic upper bound; the running time of
an algorithm cannot be more than its asymptotic upper bound.
3. Big Theta (θ) : this notation is used to denote average case time
complexity - also called the asymptotic tight bound; the running time of an
algorithm cannot be less than its asymptotic lower bound and cannot be
more than its asymptotic upper bound, i.e. it is tightly bounded.
- As in each iteration 1 comparison takes place and the search space gets reduced by half:
n => n/2 => n/4 => n/8 ......
after iteration-1 => T(n) = (n/2^1) + 1
after iteration-2 => T(n) = (n/2^2) + 2
after iteration-3 => T(n) = (n/2^3) + 3
Let's assume, after k iterations => T(n) = (n/2^k) + k ...... (equation-I)
let us assume,
=> n = 2^k
=> log n = log 2^k (by taking log on both sides)
=> log n = k log 2
=> log n = k (as log 2 ~= 1)
=> k = log n
By substituting n = 2^k & k = log n in equation-I, we get
=> T(n) = (n / 2^k) + k
=> T(n) = (2^k / 2^k) + log n
=> T(n) = 1 + log n => T(n) = O( 1 + log n ) => T(n) = O(log n).
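A minimal iterative binary search in Java matching the analysis above (a sketch assuming the int array is already sorted in ascending order; the names are illustrative):

class BinarySearchDemo {
    // the search space halves in every iteration => O(log n)
    static int binarySearch( int[] a, int key ){
        int left = 0, right = a.length - 1;
        while( left <= right ){
            int mid = ( left + right ) / 2;   // middle of the current search space
            if( a[ mid ] == key )
                return mid;                   // key found
            else if( key < a[ mid ] )
                right = mid - 1;              // continue in the left half
            else
                left = mid + 1;               // continue in the right half
        }
        return -1;                            // key not present
    }
}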
1. Selection Sort:
- In this algorithm, in the first iteration the first position gets selected and the
element at the selected position gets compared with the elements at all the
following positions sequentially; whenever the element at the selected position is
found greater than an element at another position, swapping takes place, so in the
first iteration the smallest element gets settled at the first position.
- In the second iteration, the second position gets selected and the element at the
selected position gets compared with the elements at all the following positions;
whenever the element at the selected position is found greater, swapping takes place,
so in the second iteration the second smallest element gets settled at the second
position, and so on - in at most (n-1) iterations all array elements get arranged
in a sorted manner (see the sketch below).
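A hedged Java sketch of the selection sort variant described above (swap whenever the element at the selected position is greater than a later element; names are illustrative):

class SelectionSortDemo {
    // after iteration i, the i-th smallest element is settled at position i; O(n^2)
    static void selectionSort( int[] arr ){
        int n = arr.length;
        for( int i = 0 ; i < n - 1 ; i++ ){          // select position i
            for( int j = i + 1 ; j < n ; j++ ){      // compare with all next positions
                if( arr[ i ] > arr[ j ] ){           // out of order => swap
                    int temp = arr[ i ];
                    arr[ i ] = arr[ j ];
                    arr[ j ] = temp;
                }
            }
        }
    }
}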
2. Bubble Sort :
- In this algorithm, in every iteration the elements at two consecutive
positions get compared; if they are already in order, no swapping is needed,
but if they are not in order, i.e. the element at the previous position is greater
than the element at the next position, swapping takes place. By this logic, in the
first iteration the largest element gets settled at the last position, in the second
iteration the second largest element gets settled at the second last position, and
so on - in at most (n-1) iterations all elements get arranged in a sorted manner
(see the sketch below).
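A hedged Java sketch of bubble sort as described above, including the flag optimization used in the best-case discussion later in these notes (names are illustrative):

class BubbleSortDemo {
    static void bubbleSort( int[] arr ){
        int n = arr.length;
        for( int i = 0 ; i < n - 1 ; i++ ){              // at most (n-1) iterations
            boolean swapped = false;                     // flag = false
            for( int j = 0 ; j < n - 1 - i ; j++ ){      // compare consecutive positions
                if( arr[ j ] > arr[ j + 1 ] ){           // not in order => swap
                    int temp = arr[ j ];
                    arr[ j ] = arr[ j + 1 ];
                    arr[ j + 1 ] = temp;
                    swapped = true;
                }
            }
            if( !swapped )                               // no swap in whole pass => sorted
                break;
        }
    }
}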
3. Insertion Sort:
- In this algorithm, in every iteration one element gets selected as the key
element and the key element gets inserted into the array at its appropriate
position among the elements on its left hand side, in such a way that the elements
on the left remain arranged in a sorted manner; in at most
(n-1) iterations all array elements get arranged in a sorted manner (see the sketch below).
- This algorithm works efficiently for an already sorted input sequence
by design, and hence the running time of the algorithm is O(n); this is
considered the best case.
Best Case : Ω(n) - if array elements are already arranged in a sorted manner.
Worst Case : O(n^2)
Average Case: θ(n^2)
- The insertion sort algorithm is an efficient algorithm for arrays of smaller input size.
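A hedged Java sketch of insertion sort as described above (names are illustrative):

class InsertionSortDemo {
    // best case (already sorted) O(n), worst/average case O(n^2)
    static void insertionSort( int[] arr ){
        for( int i = 1 ; i < arr.length ; i++ ){
            int key = arr[ i ];                      // element selected as the key
            int j = i - 1;
            while( j >= 0 && arr[ j ] > key ){       // shift greater elements one position right
                arr[ j + 1 ] = arr[ j ];
                j--;
            }
            arr[ j + 1 ] = key;                      // insert the key at its appropriate position
        }
    }
}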
4. Merge Sort:
- This algorithm follows the divide-and-conquer approach.
- In this algorithm, a big array is divided logically into smallest-size (i.e. size 1)
subarrays; a subarray of size 1 is sorted by definition. After dividing the array into
sorted smallest-size subarrays, the subarrays get merged into one array step by step in a
sorted manner, and finally all array elements get arranged in a sorted manner (see the sketch below).
- This algorithm works fine for even as well as odd input-size arrays.
- This algorithm takes extra space to sort the array elements, and hence its space complexity
is higher.
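A hedged Java sketch of merge sort as described above (names are illustrative; note the temporary array that causes the extra space):

class MergeSortDemo {
    static void mergeSort( int[] arr, int left, int right ){
        if( left >= right ) return;               // subarray of size 1 is sorted
        int mid = ( left + right ) / 2;
        mergeSort( arr, left, mid );              // sort the left half
        mergeSort( arr, mid + 1, right );         // sort the right half
        merge( arr, left, mid, right );           // merge the two sorted halves
    }
    static void merge( int[] arr, int left, int mid, int right ){
        int[] temp = new int[ right - left + 1 ]; // extra space => higher space complexity
        int i = left, j = mid + 1, k = 0;
        while( i <= mid && j <= right )           // pick the smaller front element each time
            temp[ k++ ] = ( arr[ i ] <= arr[ j ] ) ? arr[ i++ ] : arr[ j++ ];
        while( i <= mid )   temp[ k++ ] = arr[ i++ ];   // copy leftovers of the left half
        while( j <= right ) temp[ k++ ] = arr[ j++ ];   // copy leftovers of the right half
        System.arraycopy( temp, 0, arr, left, temp.length );
    }
}

Called as mergeSort( arr, 0, arr.length - 1 ).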
5. Quick Sort:
- This algorithm follows the divide-and-conquer approach.
- In this algorithm the basic logic is partitioning.
- Partitioning: in partitioning, a pivot element gets selected first (it may be the
leftmost, rightmost or middle element of the array); after selection of the pivot
element, all elements which are smaller than the pivot get arranged as far to its
left as possible and elements which are greater than the pivot get arranged as far to
its right as possible, and the big array is divided into two subarrays. So after the
first pass the pivot element gets settled at its appropriate position; the elements to
the left of the pivot are referred to as the left partition and the elements to its
right as the right partition (see the sketch below).
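A hedged Java sketch of quick sort with the leftmost element as the pivot, following the partitioning idea described above (names are illustrative):

class QuickSortDemo {
    static void quickSort( int[] arr, int left, int right ){
        if( left >= right ) return;                 // 0 or 1 element => already sorted
        int p = partition( arr, left, right );      // pivot settles at index p
        quickSort( arr, left, p - 1 );              // sort the left partition
        quickSort( arr, p + 1, right );             // sort the right partition
    }
    static int partition( int[] arr, int left, int right ){
        int pivot = arr[ left ];                    // leftmost element as the pivot
        int i = left, j = right;
        while( i < j ){
            while( i < right && arr[ i ] <= pivot ) i++;   // find an element greater than pivot
            while( arr[ j ] > pivot ) j--;                  // find an element <= pivot
            if( i < j ){ int t = arr[ i ]; arr[ i ] = arr[ j ]; arr[ j ] = t; }
        }
        int t = arr[ left ]; arr[ left ] = arr[ j ]; arr[ j ] = t;   // place pivot at its position
        return j;
    }
}

Called as quickSort( arr, 0, arr.length - 1 ).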
Stack: It is a collection/list of logically related elements of similar type into which data
elements can be added, as well as deleted, from only one end, referred to as the top end.
- In this collection/list, the element which was inserted last can be deleted first, so this
list works in a last-in-first-out / first-in-last-out manner, and hence it is also called a
LIFO list / FILO list.
- We can perform three basic operations on a stack in O(1) time: Push, Pop & Peek.
1. Push : to insert/add an element onto the stack at top position
step1: check stack is not full
step2: increment the value of top by 1
step3: insert an element onto the stack at top position.
2. Pop : to delete/remove an element from the stack which is at top position
step1: check stack is not empty
step2: decrement the value of top by 1.
3. Peek : to get the value of an element which is at top position without push & pop.
step1: check stack is not empty
step2: return the value of an element which is at top position
Stack Empty : top == -1
Stack Full : top == SIZE-1
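A hedged Java sketch of an array-based stack following the push/pop/peek steps above (SIZE and names are illustrative):

class ArrayStack {
    static final int SIZE = 5;                     // illustrative capacity
    int[] arr = new int[ SIZE ];
    int top = -1;                                  // top == -1 => stack empty

    boolean isEmpty(){ return top == -1; }
    boolean isFull(){ return top == SIZE - 1; }    // top == SIZE-1 => stack full

    void push( int data ){                         // O(1)
        if( isFull() ) throw new RuntimeException( "stack is full" );
        arr[ ++top ] = data;                       // increment top, then insert at top
    }
    int pop(){                                     // O(1)
        if( isEmpty() ) throw new RuntimeException( "stack is empty" );
        return arr[ top-- ];                       // return element at top, decrement top
    }
    int peek(){                                    // O(1)
        if( isEmpty() ) throw new RuntimeException( "stack is empty" );
        return arr[ top ];                         // value at top, without push & pop
    }
}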
# Applications of Stack:
- A stack is used by the OS to control the flow of execution of a program (function calls).
- In recursion, the OS internally uses a stack.
- Undo & redo functionality (e.g. in editors) is implemented using a stack.
- A stack is used to implement advanced data structure algorithms like DFS (Depth First Search)
traversal in trees & graphs.
- A stack is used in the algorithms to convert a given infix expression into its equivalent postfix
and prefix forms, and for postfix expression evaluation.
Queue: It is a collection/list of logically related elements of similar type into which elements can be added at one end,
referred to as the rear end, whereas elements can be deleted from the other end, referred to as the front end.
- In this list, the element which was inserted first can be deleted first, so this list works in a first-in-first-out manner; hence this
list is also called a FIFO list / LILO list.
- Two basic operations can be performed on a queue in O(1) time:
1. Enqueue: to insert/push/add an element into the queue from the rear end.
2. Dequeue: to delete/remove/pop an element from the queue which is at the front end.
Applications of Queue:
- A queue is used to implement OS data structures like the job queue, ready queue,
message queue, waiting queue etc...
- A queue is used to implement OS algorithms like FCFS CPU Scheduling, Priority
CPU Scheduling, FIFO Page Replacement etc...
- A queue is used to implement advanced data structure algorithms like BFS (Breadth
First Search) traversal in trees and graphs.
- A queue is used in any application/program in which a list/collection of elements
should work in a first-in-first-out manner, or wherever it should work
according to priority.
- Binary tree: it is a tree in which each node can have at most 2 child nodes,
i.e. each node can have either 0, 1 or 2 child nodes.
OR
Binary tree: it is a set of a finite number of elements having three subsets:
1. root element
2. left subtree (may be empty)
3. right subtree (may be empty)
Graph: A graph is a non-linear / advanced data structure, defined as a set of vertices and edges.
• Vertices (or Nodes) hold the data.
• Edges (or Arcs) represent relations between vertices.
• Edges may have a direction and/or a value assigned to them, called weight or cost.
• Applications of Graph:
• Electronic circuits
• Social media apps
• Communication network
• Road network
• Flight/Train/Bus services
• Bio-logical & Chemical experiments
• Deep Learning (Neural network, Tensor flow)
• Graph databases (Neo4j)
u --- v : undirected edge, (u, v) == (v, u)        u ---> v : directed edge, <u, v> != <v, u>
• If an edge can be represented either as (u,v) OR (v,u), then it is referred to as an unordered pair of
vertices, i.e. an undirected edge.
• e.g. (u,v) == (v,u) => unordered pair of vertices => undirected edge => a graph which
contains undirected edges is referred to as an undirected graph.
• If an edge cannot be represented either way, then it is referred to as an ordered pair
of vertices, i.e. a directed edge.
• <u, v> != <v, u> => ordered pair of vertices => directed edge => a graph which contains a
set of directed edges is referred to as a directed graph (di-graph).
• Connected graph:
• From each vertex some path exists to every other vertex.
• Can traverse the entire graph starting from any vertex.
• Complete graph:
• Each vertex of the graph is adjacent to every other vertex.
• Un-directed graph: Number of edges = v (v-1) / 2
• Directed graph: Number of edges = v (v-1)
[figure: example connected and complete graphs on vertices A-F]
• Load Factor = n / m
• n = Number of key-value pairs to be inserted in the hash table
• m = Number of slots in the hash table
• If n < m, then load factor < 1
• If n = m, then load factor = 1
• If n > m, then load factor > 1
• Limitations of Open Addressing
• Open addressing requires more computation.
• Cannot be used if load factor is greater than 1 (i.e. number of pairs are more
than number of slots in the table).
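A hedged Java sketch of insertion with linear probing (open addressing), illustrating the load factor = n / m idea and why it cannot exceed 1 (the hash function, names and table size are illustrative assumptions; non-negative integer keys assumed):

class LinearProbingTable {
    int m = 7;                              // number of slots in the hash table
    Integer[] slots = new Integer[ m ];     // null => empty slot
    int n = 0;                              // number of keys inserted so far

    void insert( int key ){                 // assumes key >= 0
        if( n == m ) throw new RuntimeException( "table full: load factor would exceed 1" );
        int index = key % m;                // illustrative hash function: key mod m
        while( slots[ index ] != null )     // probe the next slot until an empty one is found
            index = ( index + 1 ) % m;
        slots[ index ] = key;
        n++;
    }

    double loadFactor(){ return (double) n / m; }   // load factor = n / m
}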
rollno : int
marks : float
name : char[] / String
struct student
{
    int rollno;
    char name[ 32 ];
    float marks;
};
class Employee
{
    //data members
    int empid;
    String empName;
    float salary;
    //member functions/methods
};
Employee e1;
Employee e2;
=> learning data structures is not about learning any particular programming
language; it is a programming concept, i.e. it is nothing
but learning algorithms, and the algorithms learned in data
structures can be implemented using any programming
language.
- recursion
- recursive function
- recursive function call
- tail-recursive function
- non-tail recursive function
class Employee
{
    int empid;
    String name;
    float salary;
};
# Space Complexity:
+ index:
for size of an array = 5  => index = 0 to 5  => only 1 mem copy of index = 1 unit
for size of an array = 10 => index = 0 to 10 => only 1 mem copy of index = 1 unit
.
.
for size of an array = n  => index = 0 to n  => only 1 mem copy of index = 1 unit
+ sum:
for size of an array = 5  => only 1 mem copy of sum = 1 unit
for size of an array = 10 => only 1 mem copy of sum = 1 unit
.
.
for size of an array = n  => only 1 mem copy of sum = 1 unit
int sum( int n1, int n2 )   // n1 & n2 are formal params
{
    int res;                // local var
    res = n1 + n2;
    return res;
}
FAR contains:
- local vars
- formal params
- return addr => addr of the next instruction to be executed in its calling function
- old frame pointer => the addr of its prev stack frame/FAR
etc...
# Linear Search:
- compare the key sequentially with each array element; if a match is found, return true, otherwise
return false (see the sketch below).
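A hedged Java sketch of linear search (names are illustrative):

class LinearSearchDemo {
    // worst case: n comparisons => O(n)
    static boolean linearSearch( int[] arr, int key ){
        for( int i = 0 ; i < arr.length ; i++ ){
            if( arr[ i ] == key )
                return true;     // key found
        }
        return false;            // key not present in the array
    }
}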
assumption-1:
if the running time of an algo has an additive /
subtractive / divisive / multiplicative constant, it
can be neglected.
e.g.
O( n + 3 ) => O( n )
O( n - 2 ) => O( n )
O( n / 5 ) => O( n )
O( 6 * n ) => O( n )
# Binary Search:
n/2 / 2 => n/4
n/4 / 2 => n/8
.
.
T( n ) = n/2^k + k .... eq-I
let's assume, n = 2^k
log n = log 2^k .... [ by taking log on both sides ]
log n = k log 2
log n = k .... [ as log 2 ~= 1 ]
k = log n
1. Selection Sort:
assumption:
if the running time of an algo is a polynomial, then in
its time complexity only the leading term is considered.
e.g.
O( n^3 + n^2 + 5 ) => O( n^3 ).
assumption:
if an algo contains nested loops and the no. of iterations
of the outer loop and inner loop are not known in advance, then the
running time of such an algo is decided by the time required
for the statements inside the inner loop.
for( i = 0 ; i < n ; i++ ){
    for( j = i+1 ; j < n ; j++ )
        statement/s   // inner loop (e.g. as in selection sort)
}
+ features of sorting algorithms:
1. inplace => a sorting algo is inplace if it does not take extra space
(i.e. space other than the actual data elements and
constant space) to sort the data elements in a
collection/list of elements.
2. stable => a sorting algo is stable if it maintains the relative order of equal elements,
e.g. with two equal elements 10 and 10' (10 before 10' in the input):
After Sorting:
Output => 10 10' 20 30 40 50 => stable
# DAY-03:
2. Bubble Sort (contd.) - best case, with flag:
flag = false
iteration-0: (array already sorted)
10 20 30 40 50 60
10 20 30 40 50 60
10 20 30 40 50 60
10 20 30 40 50 60
10 20 30 40 50 60
- no swap takes place in the whole iteration, so the flag remains false and the
algorithm terminates after this single pass => best case Ω(n).
3. Insertion Sort:
Best Case:
Iteration-1:
10 20 30 40 50 60
10 20 30 40 50 60
no. of comparisons = 1
Iteration-2:
10 20 30 40 50 60
10 20 30 40 50 60
no. of comparisons = 1
Iteration-3:
10 20 30 40 50 60
10 20 30 40 50 60
no. of comparisons = 1
Iteration-4:
10 20 30 40 50 60
10 20 30 40 50 60
no. of comparisons = 1
Iteration-5:
10 20 30 40 50 60
10 20 30 40 50 60
no. of comparisons = 1
in the best case a total of (n-1) iterations are required
and in each iteration only 1 comparison takes place
total no. of comparisons = 1 * (n-1) => n-1
T( n ) = O( n - 1 )
T( n ) = O( n ) => Ω( n ).
+ Linked List:
maintainability =>
searchAndDelete():
- priority queue can be implemented using linked list
searchAndDelete()
- BST => deleteNode()
addition();
addLast()
addFirst()
addAtPos()
- whatever basic operations (i.e. addition & deletion) we
applied on the slll can be applied to the
scll as they are, except that we always need to maintain / take care
of the next part of the last node.
+ limitations of scll:
- in an scll, the addLast(), addFirst(), deleteLast() &
deleteFirst() operations are not efficient as they take O(n)
time.
- we can traverse an scll only in the forward direction
- the prev node of any node cannot be accessed from it (see the sketch below)
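A hedged Java sketch of a singly linear linked list node with addFirst()/addLast() for reference (the circular variant discussed above would additionally keep the last node's next pointing back to head; names are illustrative):

class SinglyLinkedList {
    static class Node {
        int data;
        Node next;
        Node( int data ){ this.data = data; }
    }
    Node head;                                   // only a head reference is maintained

    void addFirst( int data ){                   // O(1)
        Node newNode = new Node( data );
        newNode.next = head;
        head = newNode;
    }
    void addLast( int data ){                    // O(n): must traverse up to the last node
        Node newNode = new Node( data );
        if( head == null ){ head = newNode; return; }
        Node trav = head;
        while( trav.next != null ) trav = trav.next;
        trav.next = newNode;
    }
}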
class Employee{
//data members
//methods
}
class LinkedList{
int arr[ 5 ];
int top;
head => 44 33 22 11
OR
What is an expression?
- A combination of operands and operators.
- There are 3 types of expressions:
1. infix expression : a+b
2. prefix expression : +ab
3. postfix expression : ab+
infix expression => a*b/c*d+e-f*g+h (conversion to postfix using a stack is sketched below)
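A hedged Java sketch of infix-to-postfix conversion using a stack, for the operators + - * / only (no parentheses handled, single-letter operands; names are illustrative):

import java.util.ArrayDeque;
import java.util.Deque;

class InfixToPostfix {
    static int prec( char op ){               // precedence: * and / higher than + and -
        return ( op == '*' || op == '/' ) ? 2 : 1;
    }
    static String convert( String infix ){
        StringBuilder postfix = new StringBuilder();
        Deque<Character> stack = new ArrayDeque<>();
        for( char ch : infix.toCharArray() ){
            if( Character.isLetter( ch ) ){          // operand goes straight to the output
                postfix.append( ch );
            } else {                                 // operator
                while( !stack.isEmpty() && prec( stack.peek() ) >= prec( ch ) )
                    postfix.append( stack.pop() );   // pop operators of higher/equal precedence
                stack.push( ch );
            }
        }
        while( !stack.isEmpty() )
            postfix.append( stack.pop() );           // flush the remaining operators
        return postfix.toString();
    }
}

e.g. convert("a*b/c*d+e-f*g+h") returns "ab*c/d*e+fg*-h+".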
int arr[ 5 ];
int front;
int rear;
arr : int []
front : int
rear : int
Circular Queue:
rear = 4, front = 0
rear = 0, front = 1
rear = 1, front = 2
- in a linear queue, rear is incremented as: rear++; i.e. rear = rear + 1;
- in a circular queue, rear wraps around to the start: rear = (rear + 1) % SIZE;
(see the sketch below)
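A hedged Java sketch of a circular queue on an array, using the wrap-around increment shown above (SIZE and names are illustrative):

class CircularQueue {
    static final int SIZE = 5;
    int[] arr = new int[ SIZE ];
    int front = 0;          // index of the element to dequeue next
    int rear = -1;          // index of the last enqueued element
    int count = 0;          // number of elements currently in the queue

    void enqueue( int data ){                // add from the rear end, O(1)
        if( count == SIZE ) throw new RuntimeException( "queue is full" );
        rear = ( rear + 1 ) % SIZE;          // circular increment of rear
        arr[ rear ] = data;
        count++;
    }
    int dequeue(){                           // delete from the front end, O(1)
        if( count == 0 ) throw new RuntimeException( "queue is empty" );
        int data = arr[ front ];
        front = ( front + 1 ) % SIZE;        // circular increment of front
        count--;
        return data;
    }
}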
head => 44
Lab Work:
1. implement dynamic queue
2. implement priority queue by using dcll
class Node{
int data;
Node next;
Node prev;
int priorityValue;
}
# Basic Data Structures: comfortable
# Advanced Data Structures:
- tree (can be implemented by using an array as well as
linked list).
- binary heap (array implementation of a tree)
- graph (array & linked list)
- hash table (array & linked list)
- merge sort & quick sort
+ Tree:
+ tree terminologies:
root node
parent node/father
child node/son
grand parent/grand father
grand child/grand son
ancestors => all the nodes which are in the path from
root node to that node
ii. Inorder (L V R) :
- start the traversal always from the root node
- in this traversal, we can visit a node only after
visiting its whole left subtree (or when its left subtree is
empty).
- in this traversal of a BST, all the nodes get visited in
ascending sorted order (see the sketch below).
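A hedged Java sketch of a binary tree node and recursive inorder (L V R) traversal (names are illustrative):

class TreeNode {
    int data;
    TreeNode left, right;
    TreeNode( int data ){ this.data = data; }
}

class InorderDemo {
    // L V R: visit the whole left subtree, then the node, then the right subtree.
    // For a BST this prints the keys in ascending sorted order.
    static void inorder( TreeNode node ){
        if( node == null ) return;              // empty subtree
        inorder( node.left );                   // L
        System.out.print( node.data + " " );    // V
        inorder( node.right );                  // R
    }
}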
# DAY-08:
- queue : linear queue & circular queue
- tree : concept & definition
- tree terminologies
- addNode into the BST
inorder successor
inorder predecessor
Today's topics:
quick sort
merge sort
tree concepts: complete binary tree, AVL tree, balanced
BST, balance factor, threaded binary tree, multi-way tree,
B-tree & B+ tree, binary heap
# partitioning (pivot = leftmost element)
i = left;
j = right;
pivot = arr[ left ];
while( i < j ){
    while( i < right && arr[ i ] <= pivot ) i++;   // find an element greater than pivot
    while( arr[ j ] > pivot ) j--;                 // find an element <= pivot
    if( i < j ) swap arr[ i ] and arr[ j ];
}
swap arr[ left ] and arr[ j ];   // pivot settles at index j
[ 60 50 40 30 20 10 ]
pass-1: [ 50 40 30 20 10 ] 60 [ RP ]
pass-2: [ 40 30 20 10 ] 50 [ RP ]
pass-3: [ 30 20 10 ] 40 [ RP ]
pass-4: [ 20 10 ] 30 [ RP ]
pass-5: [ 10 ] 20 [ RP ]
n * n => n^2
[ 10 20 30 40 50 60 ]
pass=1: pivot = 10
[ LP ] 10 [ 20 30 40 50 60 ]
[ 20 30 40 50 60 ]
pass=2: pivot = 20
[ LP ] 20 [ 30 40 50 60 ]
pass-3: pivot = 30
[ 30 40 50 60 ]
[ LP ] 30 [ 40 50 60 ]
pass-4: pivot = 40
[ 40 50 60 ]
=> worst case: n * n => O(n^2)
[ 60 50 40 30 20 10 ]
pass-1: [ 10 20 30 ] 40 [ 60 50 ]
pass-2: [ 10 ] 20 [ 30 ]
pass-3: [ 50 ] 60 [ RP ]
- in the best/average case the array splits roughly in half at every pass, so the
depth of recursion is about log n; with about n comparisons per level,
T( n ) = O( n log n )
merge sort on linked list : to merge two already sorted
linked lists into a third list in a sorted manner
l3 => head => 10 -> 15 -> 20 -> 25 -> 30 -> 35 -> 40 ->
45 -> 50
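A hedged Java sketch of merging two already-sorted singly linked lists into a third sorted list, assuming a minimal node type with int data and a next reference (names are illustrative):

class SLNode {                                // minimal node (data + next), assumed for this sketch
    int data;
    SLNode next;
    SLNode( int data ){ this.data = data; }
}

class MergeSortedLists {
    static SLNode merge( SLNode l1, SLNode l2 ){
        SLNode dummy = new SLNode( 0 );       // dummy head simplifies appending
        SLNode tail = dummy;
        while( l1 != null && l2 != null ){
            if( l1.data <= l2.data ){ tail.next = l1; l1 = l1.next; }   // pick the smaller front node
            else                    { tail.next = l2; l2 = l2.next; }
            tail = tail.next;
        }
        tail.next = ( l1 != null ) ? l1 : l2; // append the remaining nodes
        return dummy.next;                    // head of the merged list (l3)
    }
}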
google map app:
information about thousands of cities and info about the paths between
those cities can be kept in a graph.
City
class City{
String cityName;
String cityCode;
String pinCode;
String state;
......
}
# DS_DAY-11:
+ Graph Traversal Algorithms:
1. dfs (depth first search) traversal - stack
2. bfs (breadth first search) traversal - queue
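Hedged Java sketches of BFS (queue-based) and DFS (recursive, i.e. using the implicit call stack) on an adjacency-list graph (names are illustrative):

import java.util.*;

class GraphTraversal {
    // BFS: visit vertices level by level using a queue (FIFO)
    static void bfs( List<List<Integer>> adj, int start ){
        boolean[] visited = new boolean[ adj.size() ];
        Queue<Integer> queue = new LinkedList<>();
        visited[ start ] = true;
        queue.add( start );
        while( !queue.isEmpty() ){
            int v = queue.remove();                 // visit in first-in-first-out order
            System.out.print( v + " " );
            for( int w : adj.get( v ) ){
                if( !visited[ w ] ){ visited[ w ] = true; queue.add( w ); }
            }
        }
    }

    // DFS: go as deep as possible along each branch before backtracking
    static void dfs( List<List<Integer>> adj, int v, boolean[] visited ){
        visited[ v ] = true;
        System.out.print( v + " " );
        for( int w : adj.get( v ) ){
            if( !visited[ w ] ) dfs( adj, w, visited );
        }
    }
}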