
Module 4

Time complexity is defined as the amount of time taken by an algorithm to run, as a function of the
length of the input.

Time complexity is expressed in Big-O notation.

Searching

Searching is the process of finding a particular element in a list. If the element is present in the list,
the search is called successful and returns the location of that element; otherwise, the search is called
unsuccessful.

The most common searching algorithms are:

• Linear search

• Binary search

Linear Search

This is the simplest of all searching techniques. In this technique, an ordered or unordered list is
searched one element at a time from the beginning until the desired element is found. If the desired element
is found in the list, the search is successful; otherwise it is unsuccessful.

Linear search is also called the sequential search algorithm.

A simple approach to do a linear search

 Start from the leftmost element of ar[] and one by one compare x with each element of ar[].
 If x matches an element, return the index of that element.
 If x doesn't match any of the elements, return -1.

The time complexity of linear search is O(n).

Algorithm:

Let array a[n] store n elements. Determine whether element 'x' is present or not.

linsrch(a[n], x)
{
    index = 0;
    flag = 0;
    while (index < n)
    {
        if (x == a[index])
        {
            flag = 1;
            break;
        }
        index++;
    }
    if (flag == 1)
        printf("Data found at %d position", index);
    else
        printf("Data not found");
}

Now, let's see the working of the linear search algorithm.

To understand how linear search works, let's take an unsorted array; an example makes the process easy to
follow.

Let the elements of the array be:

Let the element to be searched be K = 41.

Start from the first element and compare K with each element of the array.
The value of K, i.e., 41, does not match the first element of the array, so move to the next element,
and follow the same process until the respective element is found.

Once the element to be searched is found, the algorithm returns the index of the matched element.

Program

#include <stdio.h>

int main()
{
    int number[25], n, data, i, flag = 0;

    printf("\n Enter the number of elements:");
    scanf("%d", &n);

    printf("\n Enter the elements:");
    for (i = 0; i < n; i++)
        scanf("%d", &number[i]);

    printf("\n Enter the element to be searched:");
    scanf("%d", &data);

    for (i = 0; i < n; i++)
    {
        if (number[i] == data)
        {
            flag = 1;
            break;
        }
    }

    if (flag == 1)
        printf("\n Data found at location: %d", i + 1);
    else
        printf("\n Data not found");

    return 0;
}

BINARY SEARCH

The binary search algorithm is fast in terms of run-time complexity. It works on the basis of the
divide-and-conquer rule. In this algorithm the data collection must first be sorted in ascending order; the
target item is then compared with the middle item of the collection. If a match is found, the
index of the item is returned. If the middle item is greater than the target item, the item is searched in the
sub-array to the left of the middle item. Otherwise, the item is searched for in the sub-array to the right of
the middle item. This process continues on the sub-array until the size of the sub-array reduces to
zero.

The time complexity of binary search is O(log n).


Algorithm:

Step 1: set beg = lower_bound, end = upper_bound, pos = - 1


Step 2: repeat steps 3 and 4 while beg <=end
Step 3: set mid = (beg + end)/2
Step 4: if a[mid] = val
set pos = mid
print pos
go to step 6
else if a[mid] > val
set end = mid - 1
else
set beg = mid + 1
[end of if]
[end of loop]
Step 5: if pos = -1
print "value is not present in the array"
[end of if]
Step 6: exit
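
The steps above translate directly into code. Below is a minimal iterative sketch in C; the function name binsrch and the array contents in main are illustrative, not part of the original algorithm.

#include <stdio.h>

/* Iterative binary search: returns the index of val in the sorted
   array a[0..n-1], or -1 if val is not present. */
int binsrch(int a[], int n, int val)
{
    int beg = 0, end = n - 1;

    while (beg <= end)
    {
        int mid = (beg + end) / 2;   /* middle index of the current sub-array */

        if (a[mid] == val)
            return mid;              /* match found */
        else if (a[mid] > val)
            end = mid - 1;           /* search the left sub-array */
        else
            beg = mid + 1;           /* search the right sub-array */
    }
    return -1;                       /* value is not present */
}

int main()
{
    int a[] = {10, 12, 24, 29, 39, 40, 51, 56, 70};   /* must already be sorted */
    int pos = binsrch(a, 9, 56);

    if (pos == -1)
        printf("Value is not present in the array");
    else
        printf("Value found at index %d", pos);
    return 0;
}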

Example:

Let the elements of the array be:

Let the element to be searched be K = 56.

We use the following formula to calculate the mid of the array -

1. mid = (beg + end)/2


So, in the given array -

beg = 0

end = 8

mid = (0 + 8)/2 = 4. So, 4 is the mid index of the array.

The comparison continues on the appropriate sub-array until the element being searched is found, and the
algorithm then returns the index of the matched element.
Sorting

• The arrangement of data in a preferred order is called sorting.

• By sorting data, it is easier to search through it quickly and easily.

The three sorting techniques are

1. Bubble sort
2. Insertion sort
3. Selection sort

Bubble Sort

• It is the easiest and simplest of all the sorting algorithms.

• It works on the principle of repeatedly swapping adjacent elements in case they are not in the right order.

• In simpler terms, if the input is to be sorted in ascending order, the bubble sort will first compare the first
two elements in the array.

• In case the second one is smaller than the first, it will swap the two, and move on to the next element, and so
on

The time complexity of bubble sort is O(n²).

Algorithm:

In the algorithm given below, suppose arr is an array of n elements. The assumed swap function in the
algorithm will swap the values of the given array elements.

begin BubbleSort(arr)
   for pass = 1 to n-1
      for i = 0 to n-pass-1
         if arr[i] > arr[i+1]
            swap(arr[i], arr[i+1])
         end if
      end for
   end for
   return arr
end BubbleSort
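
The same procedure can be written as a short C function; this is a minimal sketch, and the array in main uses the same values as the example that follows.

#include <stdio.h>

/* Bubble sort: repeatedly swap adjacent out-of-order elements.
   After pass k, the largest k elements are in their final positions. */
void bubbleSort(int arr[], int n)
{
    for (int pass = 0; pass < n - 1; pass++)
    {
        for (int i = 0; i < n - pass - 1; i++)
        {
            if (arr[i] > arr[i + 1])
            {
                int temp = arr[i];          /* swap the adjacent elements */
                arr[i] = arr[i + 1];
                arr[i + 1] = temp;
            }
        }
    }
}

int main()
{
    int arr[] = {13, 32, 26, 35, 10};
    int n = 5;

    bubbleSort(arr, n);
    for (int i = 0; i < n; i++)
        printf("%d ", arr[i]);              /* prints 10 13 26 32 35 */
    return 0;
}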
Example:

Let the elements of the array be: 13, 32, 26, 35, 10.

First Pass

Sorting will start from the initial two elements. Let us compare them to check which is greater.

Here, 32 is greater than 13 (32 > 13), so it is already sorted. Now, compare 32 with 26.

Here, 26 is smaller than 32. So, swapping is required. After swapping, the new array will look like -

Now, compare 32 and 35.

Here, 35 is greater than 32. So, there is no swapping required as they are already sorted.

Now, the comparison will be in between 35 and 10.


Here, 10 is smaller than 35, so they are not in order and swapping is required. Now, we reach the end of the
array. After the first pass, the array will be -

Now, move to the second iteration.

Second Pass

The same process will be followed for second iteration.

Here, 10 is smaller than 32. So, swapping is required. After swapping, the array will be -

Now, move to the third iteration.

Third Pass

The same process will be followed for third iteration.


Here, 10 is smaller than 26. So, swapping is required. After swapping, the array will be -

Now, move to the fourth iteration.

Fourth pass

Similarly, after the fourth iteration, the array will be -

Hence, there is no swapping required, so the array is completely sorted.

Selection sort

The selection sort algorithm sorts an array by repeatedly finding the minimum element (for ascending order)
from unsorted part and putting it at the beginning.

The algorithm maintains two subarrays in a given array.

1) The subarray which is already sorted.

2) Remaining subarray which is unsorted.

In every iteration of selection sort, the minimum element from the unsorted subarray is picked and moved to
the sorted subarray.

The time complexity of selection sort is O(n²).

Algorithm:

• Let a be an array of n elements.


• Pass 1: find the position i of the smallest element in the list of n elements a[0], a[1], ..., a[n-1] and
interchange a[i] with a[0]. Then a[0] is sorted.

• Pass 2: find the position i of the smallest element in the sub-list of n-1 elements a[1], ..., a[n-1] and
interchange a[i] with a[1]. Then a[0], a[1] are sorted.

• Pass n-1: find the position i of the smaller of the elements a[n-2], a[n-1] and interchange a[i] with a[n-2].
Then a[0], a[1], ..., a[n-1] are sorted.
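
These passes can be coded in C as shown below; this is a minimal sketch, and the array in main is chosen to match the example that follows.

#include <stdio.h>

/* Selection sort: in pass i, find the smallest element of the
   unsorted part a[i..n-1] and swap it into position i. */
void selectionSort(int a[], int n)
{
    for (int i = 0; i < n - 1; i++)
    {
        int min = i;                        /* index of the smallest element so far */
        for (int j = i + 1; j < n; j++)
            if (a[j] < a[min])
                min = j;

        int temp = a[i];                    /* interchange a[min] with a[i] */
        a[i] = a[min];
        a[min] = temp;
    }
}

int main()
{
    int a[] = {12, 31, 25, 8, 32, 17};
    int n = 6;

    selectionSort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);                /* prints 8 12 17 25 31 32 */
    return 0;
}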

Example:

Let the elements of the array be: 12, 31, 25, 8, 32, 17.

Pass 1: the smallest element in the whole list is 8, at position 3. Interchange it with a[0]:
8, 31, 25, 12, 32, 17

Pass 2: the smallest element in the remaining list 31, 25, 12, 32, 17 is 12. Interchange it with a[1]:
8, 12, 25, 31, 32, 17

Pass 3: the smallest element in 25, 31, 32, 17 is 17. Interchange it with a[2]:
8, 12, 17, 31, 32, 25

Pass 4: the smallest element in 31, 32, 25 is 25. Interchange it with a[3]:
8, 12, 17, 25, 32, 31

Pass 5: the smaller of 32 and 31 is 31. Interchange it with a[4]:
8, 12, 17, 25, 31, 32

Now, the array is completely sorted.

Insertion sort
The idea behind the insertion sort is that first take one element, iterate it through the sorted array. Although it
is simple to use, it is not appropriate for large data sets as the time complexity of insertion sort in the average
case and worst case is O(n2)

Algorithm

The simple steps of achieving the insertion sort are listed as follows -

Step 1 - If the element is the first element, assume that it is already sorted. Return 1.
Step 2 - Pick the next element, and store it separately in a key.

Step 3 - Now, compare the key with all elements in the sorted array.

Step 4 - If the element in the sorted array is smaller than the current element, then move to the next
element. Else, shift greater elements in the array towards the right.

Step 5 - Insert the value.

Step 6 - Repeat until the array is sorted.
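
The steps above can be sketched in C as follows; the shifting loop corresponds to Steps 3 and 4, and the array in main mirrors the example that follows.

#include <stdio.h>

/* Insertion sort: take each element in turn (the "key") and insert it
   into its correct position within the already-sorted prefix a[0..i-1]. */
void insertionSort(int a[], int n)
{
    for (int i = 1; i < n; i++)
    {
        int key = a[i];                     /* pick the next element (Step 2) */
        int j = i - 1;

        while (j >= 0 && a[j] > key)        /* shift greater elements right (Step 4) */
        {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;                     /* insert the value (Step 5) */
    }
}

int main()
{
    int a[] = {12, 31, 25, 8, 32, 17};
    int n = 6;

    insertionSort(a, n);
    for (int i = 0; i < n; i++)
        printf("%d ", a[i]);                /* prints 8 12 17 25 31 32 */
    return 0;
}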

Example:

Let the elements of the array be: 12, 31, 25, 8, 32, 17.

Initially, the first two elements are compared in insertion sort.

Here, 31 is greater than 12. That means both elements are already in ascending order. So, for now, 12 is
stored in a sorted sub-array.

Now, move to the next two elements and compare them.

Here, 25 is smaller than 31, so 31 is not at the correct position. Now, swap 31 with 25. Along with
swapping, insertion sort will also check it against all elements in the sorted array.

For now, the sorted array has only one element, i.e. 12. So, 25 is greater than 12. Hence, the sorted array
remains sorted after swapping.
Now, two elements in the sorted array are 12 and 25. Move forward to the next elements that are 31 and
8.

Both 31 and 8 are not sorted. So, swap them.

After swapping, elements 25 and 8 are unsorted.

So, swap them.

Now, elements 12 and 8 are unsorted.

So, swap them too.

Now, the sorted array has three items that are 8, 12 and 25. Move to the next items that are 31 and 32.
Hence, they are already sorted. Now, the sorted array includes 8, 12, 25 and 31.

Move to the next elements that are 32 and 17.

17 is smaller than 32. So, swap them.

Swapping makes 31 and 17 unsorted. So, swap them too.

Now, swapping makes 25 and 17 unsorted. So, perform swapping again.

Now, the array is completely sorted.

HASHING

Sequential search requires, on the average, O(n) comparisons to locate an element. So many comparisons
are not desirable for a large database of elements. Binary search requires far fewer comparisons, on the
average O(log n), but there is an additional requirement that the data be sorted. Even with the
best sorting algorithm, sorting the elements requires O(n log n) comparisons.
There is another widely used technique for storing data called hashing. It does away with the
requirement of keeping data sorted, and its best-case time complexity is of constant order, O(1). In its
worst case, a hashing algorithm starts behaving like linear search.

Best case time complexity of searching using hashing = O(1)

Worst case time complexity of searching using hashing = O(n)

In hashing, a record with key value "key" is referred to directly by calculating an address from the key
value. The address or location of an element or record x is obtained by computing some arithmetic function
H of its key. Such a function is called a hash function or hashing function.

H(key) = address

The function H may not yield distinct values: it is possible that two different keys k1 and k2 will yield the
same hash address. This situation is called collision, and some method must be used to resolve it.
Accordingly, the topic of hashing is divided into two parts:
(1) Hash functions
(2) Collision resolutions.
1. HASH FUNCTIONS
The two principal criteria used in selecting a hash function H are as follows.
1. The function H should be very easy and quick to compute.
2. The function H should, as far as possible, uniformly distribute the hash addresses throughout the
set so that there are a minimum number of collisions.

Popular Hash functions are described below:

a) DIVISION METHOD
Choose a number m larger than the number n of keys in K. (The number m is usually chosen to be
a prime number or a number without small divisors, since this frequently minimizes the number
of collisions.) The hash function H is defined by
H(k) = k (mod m) or H(k) = k (mod m) + 1
Here k (mod m) denotes the remainder when k is divided by m. The second formula is used when
we want the hash addresses to range from 1 to m rather than from 0 to m-1.
b) MID SQUARE METHOD
The key k is squared. Then the hash function H is defined by
H(k) = l
where l is obtained by deleting digits from both ends of k². We emphasize that the same
positions of k² must be used for all of the keys.
c) FOLDING METHOD
The key k is partitioned into a number of parts, k1, ..., kr, where each part, except possibly the
last, has the same number of digits as the required address. Then the parts are added together,
ignoring the last carry. That is,
H(k) = k1 + k2 + ... + kr
where the leading-digit carries, if any, are ignored. Sometimes the even-numbered parts,
k2, k4, ..., are each reversed before the addition.
EXAMPLE OF HASH FUNCTIONS
Consider a company, each of whose 68 employees is assigned a unique 4-digit
employee number. Suppose the address space L consists of 100 two-digit addresses: 00, 01, 02, ..., 99. We apply the
above hash functions to each of the following employee numbers:
3205, 7148, 2345
a. Division method
Choose a prime number m close to 99, such as m=97. Then
H(3205)=4, H(7148)=67, H(2345)=17
That is, dividing 3205 by 97 gives a remainder of 4, dividing 7148 by 97 gives a remainder of 67, and
dividing 2345 by 97 gives a remainder of 17. In the case that the memory addresses begin with 01 rather
than 00, we choose that the function H(k)=k(mod m)+1 to obtain:
H(3205)=4+1=5, H(7148)=67+1=68, H(2345)=17+1=18
b. Mid-square method
The following calculations are performed:

k:      3205          7148          2345
k²:     10 272 025    51 093 904    5 499 025
H(k):   72            93            99

Observe that the fourth and fifth digits, counting from the right, are chosen for the hash address.
c. Folding Method
Chopping the key k into two parts and adding yields the following hash addresses:
H(3205)=32+05=37, H(7148)=71+48=19, H(2345)=23+45=68
Observe that the leading digit 1 in H(7148) is ignored. Alternatively, one may want to reverse the second
part before adding, thus producing the following hash addresses:
H(3205)=32+50=82, H(7148)=71+84=55, H(2345)=23+54=77
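
For the division and folding methods, the calculations above can be checked with a few lines of C. This is only a sketch: the function names and the fixed split of a 4-digit key into two 2-digit parts are assumptions for this particular example.

#include <stdio.h>

/* Division method: remainder when the key is divided by m (here m = 97). */
int hash_division(int key, int m)
{
    return key % m;
}

/* Folding method for a 4-digit key: split into two 2-digit parts,
   add them, and ignore the leading carry by keeping only two digits. */
int hash_folding(int key)
{
    int high = key / 100;                   /* first two digits */
    int low  = key % 100;                   /* last two digits  */
    return (high + low) % 100;              /* drop any leading carry */
}

int main()
{
    int keys[] = {3205, 7148, 2345};

    for (int i = 0; i < 3; i++)
        printf("key %d: division = %d, folding = %d\n",
               keys[i], hash_division(keys[i], 97), hash_folding(keys[i]));
    /* prints 4 and 37, 67 and 19, 17 and 68, as computed above */
    return 0;
}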
2. COLLISION RESOLUTION
Suppose we want to add a new record R with key k to our file F, but suppose the memory location address
H(k) is already occupied. This situation is called a collision. When the hash value of a key maps to an
already occupied bucket of the hash table, it is called a collision. Collision resolution techniques are
the techniques used for resolving or handling the collision.

a. Separate Chaining

This technique creates a linked list for the slot in which a collision occurs. The new key is then inserted
into that linked list. These linked lists attached to the slots look like chains, which is why this technique
is called separate chaining. If memory space is tight, separate chaining should be avoided, because
additional memory space is wasted in storing the links to the chained elements.
Example: Let us consider a simple hash function "key mod 7" and the sequence of keys 50, 700,
76, 85, 92, 73, 101.
50 mod 7 = 1
700 mod 7 = 0
76 mod 7 = 6
85 mod 7 = 1
92 mod 7 = 1
73 mod 7 = 3
101 mod 7 = 3

Collecting the keys bucket by bucket, the resulting chains are:
bucket 0: 700
bucket 1: 50 → 85 → 92
bucket 3: 73 → 101
bucket 6: 76
(buckets 2, 4 and 5 remain empty)
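
A minimal sketch of chained insertion in C is shown below; the struct layout, the table size of 7 and the function name insert are illustrative assumptions. Note that inserting at the front of each list means the keys appear in each chain in reverse order of insertion.

#include <stdio.h>
#include <stdlib.h>

#define TABLE_SIZE 7

/* Each slot of the hash table points to a linked list (chain) of keys. */
struct node {
    int key;
    struct node *next;
};

struct node *table[TABLE_SIZE];             /* all chains start out empty */

/* Insert a key at the front of the chain for its hash slot. */
void insert(int key)
{
    int slot = key % TABLE_SIZE;            /* hash function: key mod 7 */
    struct node *n = malloc(sizeof(struct node));
    n->key = key;
    n->next = table[slot];                  /* link in front of the existing chain */
    table[slot] = n;
}

int main()
{
    int keys[] = {50, 700, 76, 85, 92, 73, 101};

    for (int i = 0; i < 7; i++)
        insert(keys[i]);

    for (int slot = 0; slot < TABLE_SIZE; slot++) {
        printf("bucket %d:", slot);
        for (struct node *n = table[slot]; n != NULL; n = n->next)
            printf(" %d", n->key);
        printf("\n");
    }
    return 0;
}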

b. Open Addressing

Separate chaining requires additional memory space for pointers. Open addressing is an alternative
method of handling collisions. Unlike separate chaining, all the keys are stored inside the hash table; no
key is stored outside the hash table.
Insert(k): Keep probing until an empty slot is found. Once an empty slot is found, insert k.
Search(k): Keep probing until the slot's key becomes equal to k or an empty slot is reached.
Delete(k): The delete operation is interesting. If we simply delete a key, then a later search may fail. So slots
of deleted keys are marked specially as "deleted".
Insert can place an item in a slot marked "deleted", but search does not stop at a deleted slot.
Techniques used for open addressing are-
 Linear Probing
 Quadratic Probing
 Double Hashing
(i) Linear Probing
In linear probing, whenever there is a collision, cells are searched sequentially (one by one) for an empty
cell.
Example:
Let us consider a simple hash function "key mod 7" and the sequence of keys 50, 700, 76, 85, 92, 73, 101.
The keys are placed as follows: 50 goes to slot 1, 700 to slot 0 and 76 to slot 6. 85 hashes to slot 1,
which is occupied, so it is placed in the next free slot, 2. 92 also hashes to slot 1 and ends up in slot 3.
73 hashes to slot 3 and is placed in slot 4, and 101 hashes to slot 3 and is placed in slot 5.
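
A minimal sketch of linear-probing insertion in C is given below; the sentinel value used to mark empty slots and the function name insert are illustrative assumptions.

#include <stdio.h>

#define TABLE_SIZE 7
#define EMPTY -1                             /* sentinel marking an unused slot */

int table[TABLE_SIZE] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

/* Linear probing: on a collision, try the next cell (wrapping around)
   until an empty cell is found. */
void insert(int key)
{
    int slot = key % TABLE_SIZE;             /* hash function: key mod 7 */
    while (table[slot] != EMPTY)
        slot = (slot + 1) % TABLE_SIZE;      /* search cells sequentially */
    table[slot] = key;
}

int main()
{
    int keys[] = {50, 700, 76, 85, 92, 73, 101};

    for (int i = 0; i < 7; i++)
        insert(keys[i]);

    for (int slot = 0; slot < TABLE_SIZE; slot++)
        printf("%d -> %d\n", slot, table[slot]);
    /* prints 0 -> 700, 1 -> 50, 2 -> 85, 3 -> 92, 4 -> 73, 5 -> 101, 6 -> 76 */
    return 0;
}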

Challenges in Linear Probing:


1. Primary Clustering: One of the problems with linear probing is primary clustering: many
consecutive elements form groups, and it then takes longer to find a free slot or to search for an
element.
2. Secondary Clustering: Secondary clustering is less severe; two records have the same
collision chain (probe sequence) only if their initial position is the same.

(ii) Quadratic Probing


One way of reducing primary clustering is to use quadratic probing to resolve collisions.
Suppose the key is mapped to location j and cell j is already occupied. Quadratic probing
searches the locations (j + i²) % size for i = 1, 2, .... That is, the locations j, (j+1), (j+4),
(j+9), ..., i.e. j, (j+1*1), (j+2*2), (j+3*3), ..., are examined to find the first empty cell
where the key is to be inserted. This technique reduces primary clustering.
Example: Let us take table size 11 and apply this technique to the following elements:
29, 18, 43, 10, 46, 54
29 % 11 = 7
18 % 11 = 7
43 % 11 = 10
10 % 11 = 10
46 % 11 = 2
54 % 11 = 10

The resulting table (index → key) is:
0 → 10
1 →
2 → 46
3 → 54
4 →
5 →
6 →
7 → 29
8 → 18
9 →
10 → 43
After inserting element 43 at position 10 of the table, key 10 (which also hashes to 10) searches for an
empty position and finds (j+1) % 11 = 0, which is empty. When we insert 54, after the collision it first
probes position (j+1) % 11 = 0, which is already occupied, so it then probes position (j+4) % 11 = 3,
which is empty, and 54 is inserted at that position.
(iii) Double Hashing
Double hashing is a collision resolution technique for open-addressed hash tables. Double
hashing uses the idea of applying a second hash function to the key when a collision occurs.
Double hashing can be done using:
(hash1(key) + i * hash2(key)) % TABLE_SIZE
Here hash1() and hash2() are hash functions and TABLE_SIZE is the size of the hash table.
The first hash function is typically hash1(key) = key % TABLE_SIZE.
A popular second hash function is hash2(key) = PRIME - (key % PRIME), where
PRIME is a prime smaller than TABLE_SIZE.
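
Putting the two hash functions together, a minimal insertion sketch in C might look like the following; the table size of 7, the prime 5 and the empty-slot sentinel are illustrative assumptions.

#include <stdio.h>

#define TABLE_SIZE 7
#define PRIME 5                              /* a prime smaller than TABLE_SIZE */
#define EMPTY -1

int table[TABLE_SIZE] = {EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY, EMPTY};

int hash1(int key) { return key % TABLE_SIZE; }
int hash2(int key) { return PRIME - (key % PRIME); }

/* Double hashing: the i-th probe for a key is
   (hash1(key) + i * hash2(key)) % TABLE_SIZE. */
void insert(int key)
{
    int i = 0;
    int slot = hash1(key);
    while (table[slot] != EMPTY)             /* probe until an empty slot is found */
    {
        i++;
        slot = (hash1(key) + i * hash2(key)) % TABLE_SIZE;
    }
    table[slot] = key;
}

int main()
{
    int keys[] = {50, 700, 76, 85, 92, 73, 101};

    for (int i = 0; i < 7; i++)
        insert(keys[i]);

    for (int slot = 0; slot < TABLE_SIZE; slot++)
        printf("%d -> %d\n", slot, table[slot]);
    return 0;
}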
