FDS - Unit II
Syllabus
Sequential Organization
Array
Array as an Abstract Data Type,
Operations on Array,
Merging of two arrays,
Storage Representation and their Address Calculation: Row major and
Column Major,
Multidimensional Arrays: Two-dimensional arrays, n-dimensional arrays.
Ordered List, Single Variable Polynomial: Representation using arrays,
Polynomial as array of structure, Polynomial addition, Polynomial
multiplication.
Sparse Matrix: Sparse matrix representation using array, Sparse
matrix addition, Transpose of sparse matrix- Simple and Fast Transpose,
Time and Space tradeoff.
Sequential Organization
What is a File, Record and Field?
File operations
What is File organization?
Sequential access file organization
What is a file?
A file is a sequence or collection of records.
Large amounts of data are stored on physical storage in the
form of files.
A file is a named collection of related information that is
recorded on secondary storage such as magnetic disks,
magnetic tapes and optical disks.
A sequential file contains records organized by the order in
which they were entered. The order of the records is fixed.
Records in sequential files can be read or written only
sequentially.
New records are added at the end of the file.
Sequential output is also useful for printing reports.
File, Record and Field
File Operations
Operations on database files can be broadly classified into two
categories −
1. Update Operations
2. Retrieval Operations
Open − A file can be opened in one of the two modes, read
mode or write mode.
Locate/search
Read
Write
Close : This is the most important operation from the operating
system’s point of view. When a request to close a file is generated, the
operating system
Removes all the locks (if in shared mode),
Saves the data (if altered) to the secondary storage media, and
Releases all the buffers and file handlers associated with the file.
What is File Organization?
File Organization refers to the logical relationships among the various
records that constitute the file.
In simple terms, storing the records of a file in a certain order is called
file organization.
File organization refers to physical layout or a structure of record
occurrences in a file.
File organization determines the way records are stored and
accessed.
In many cases, all records in a file are of the same record type. If
every record in the file has exactly the same size (in bytes), the file is
said to be made of fixed-length records.
If different records in the file have different sizes, the file is said to be
made up of variable-length records.
Sequential access file organization
Storing records in contiguous blocks within files on tape or disk is called sequential
access file organization.
All records are stored in a sequential order. The records are arranged in the ascending or
descending order of a key field.
Sequential file search starts from the beginning of the file and the records can be added
at the end of the file.
In sequential file, it is not possible to add a record in the middle of the file without
rewriting the file.
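As a minimal sketch of the behaviour described above, the Python snippet below appends records at the end of a text file and searches it from the beginning. The file name, the comma-separated record layout and the helper names append_record and find_record are assumptions made for illustration; they are not part of the original notes.

```python
import os
import tempfile


def append_record(path, record):
    """Add a record at the end of the file, the only place a sequential file grows."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(",".join(record) + "\n")


def find_record(path, key):
    """Scan from the first record onward until field 0 matches the key."""
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            fields = line.rstrip("\n").split(",")
            if fields[0] == key:
                return fields
    return None


# Hypothetical sample data: (roll number, name)
path = os.path.join(tempfile.mkdtemp(), "students.txt")
for rec in [("101", "Asha"), ("102", "Ravi"), ("103", "Meera")]:
    append_record(path, rec)

found = find_record(path, "102")
missing = find_record(path, "999")
print(found, missing)
```

Finding the last record requires reading every record before it, which is why sequential search becomes slow on large files.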
Advantages of sequential file
It is simple to program and easy to design.
Disadvantages of sequential file
Processing a sequential file is time consuming, because records must be read one after another.
It has high data redundancy.
The sorted file method is inefficient, as sorting the records takes time and space.
This method is quite simple: we store the records in
a sequence, i.e. one after the other, in the order in which they are
inserted into the tables.
Insertion of new record –
Array
Array as an Abstract Data Type
Operations on Array
Merging of two arrays
Storage Representation and their Address Calculation:
Row major and Column Major
Multidimensional Arrays:
Two-dimensional arrays
n-dimensional arrays.
Overview of Array
An array is a kind of data structure that can store a fixed-size
sequential collection of elements of the same type.
Array Representation
Array as Abstract Data Types (ADT)
Creating an Array and Printing
An array is created in Python by importing the array module into the
program.
Syntax :-
from array import *
arrayName = array(typecode, [Initializers])
Accessing Array Element
We can access each element of an array using the index of the element.
from array import *

arr = array('i', [10, 20, 30, 40, 50])
for i in arr:
    print(i)
print("Access first three items individually")
print(arr[0])
print(arr[1])
print(arr[2])

OUTPUT
10
20
30
40
50
Access first three items individually
10
20
30
Insertion Operation
Deletion Operation
#search operation
index = eleSearch(arr, n, key)
if index != -1:
    print("element found at position: " + str(index + 1))
else:
    print("element not found")
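The search snippet calls eleSearch without showing its body. A minimal sketch, assuming it is a plain linear search that returns the index of the key or -1 when the key is absent, could look like this (the sample values are illustrative):

```python
def eleSearch(arr, n, key):
    """Linear search over the first n elements; returns the index of key or -1."""
    for i in range(n):
        if arr[i] == key:
            return i
    return -1


arr = [10, 20, 30, 40, 50]
found_index = eleSearch(arr, len(arr), 30)
missing_index = eleSearch(arr, len(arr), 35)
print(found_index, missing_index)  # 2 -1
```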
Update Operation
Update operation refers to changing the value of an existing element
of the array at a given index.
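The insertion, deletion and update operations named above can be tried with Python's array module. The following is a small sketch; the values and variable names are chosen only for illustration.

```python
from array import array

arr = array('i', [10, 20, 30, 40, 50])

# Insertion: place 15 at index 1; later elements shift one position right.
arr.insert(1, 15)
after_insert = arr.tolist()

# Deletion: by value (first occurrence) and by index.
arr.remove(40)
del arr[0]
after_delete = arr.tolist()

# Update: overwrite the element stored at index 2.
arr[2] = 99
after_update = arr.tolist()

print(after_insert)
print(after_delete)
print(after_update)
```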
Merging of two arrays
Index   0   1   2   3
arr1    10  11  12  14
arr2    15  20  21  51
The merged array arr3 has 8 positions, indexed 0 to 7.
# Merge two arrays that are already sorted in ascending order
def mergeArrays(arr1, arr2, n1, n2):
    arraySize = n1 + n2
    arr3 = [None] * arraySize
    i = 0
    j = 0
    k = 0
    # Traverse both arrays
    while i < n1 and j < n2:
        # If the current element of the first array is smaller than the current
        # element of the second array, store it and advance the first index.
        # Otherwise do the same with the second array.
        if arr1[i] < arr2[j]:
            arr3[k] = arr1[i]
            k = k + 1
            i = i + 1
        else:
            arr3[k] = arr2[j]
            k = k + 1
            j = j + 1
    # Store remaining elements of first array
    while i < n1:
        arr3[k] = arr1[i]
        k = k + 1
        i = i + 1
    # Store remaining elements of second array
    while j < n2:
        arr3[k] = arr2[j]
        k = k + 1
        j = j + 1
    return arr3

arr1 = []
num = int(input('How many elements do you want to enter?: '))
for i in range(0, num):
    print('Enter the', i, 'number: ')
    n = int(input())
    arr1.append(n)

arr2 = []
num1 = int(input('How many elements do you want to enter?: '))
for i in range(0, num1):
    print('Enter the', i, 'number: ')
    n1 = int(input())
    arr2.append(n1)

merged = mergeArrays(arr1, arr2, num, num1)
print(merged)
Storage Representation and their Address Calculation:
Single dimensional arrays
Two- dimensional arrays
Row major representation
Column Major representation
Address Calculation in Single (One) Dimensional
Array:
Formula
Address of an element say “A[ I ]” is calculated using the
following formula:
Address of A [ I ] = B + W * ( I – LB )
Where,
B = Base address
W = Storage Size of one element stored in the array (in byte)
I = Subscript of element whose address is to be found
LB = Lower limit / Lower Bound of subscript, if not specified
assume 0 (zero)
Example:
Given the base address of an array B[1300…..1900] as 1020
and the size of each element as 2 bytes in memory, find the
address of B[1700].
Solution:
The given values are: B = 1020, LB = 1300, W = 2, I = 1700
Address of A [ I ] = B + W * ( I – LB )
= 1020 + 2 * (1700 – 1300)
= 1020 + 2 * 400
= 1020 + 800
= 1820
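The one-dimensional formula can be checked with a few lines of Python. This is a small sketch; the function name address_1d and its parameter names are illustrative, not taken from the notes.

```python
def address_1d(base, width, index, lower_bound=0):
    """Address of A[I] = B + W * (I - LB)."""
    return base + width * (index - lower_bound)


# Values from the worked example: B = 1020, W = 2, I = 1700, LB = 1300
addr = address_1d(base=1020, width=2, index=1700, lower_bound=1300)
print(addr)  # 1820
```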
Address Calculation in Double (Two) Dimensional
Array:
When the elements of a 2-D array are stored in memory, they are
allocated contiguous memory locations. Therefore, a 2-D array
must be linearized to enable its storage.
There are two alternatives to achieve linearization: Row-Major
and Column-Major.
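Address of an element of any array say “A[ I ][ J ]” is calculated in two
forms as given:
Row Major: Address of A [ I ][ J ] = B + W * [ N * ( I – Lr ) + ( J – Lc ) ]
Column Major: Address of A [ I ][ J ] = B + W * [ ( I – Lr ) + M * ( J – Lc ) ]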
Where,
B = Base address
I = Row subscript of element whose address is to be found
J = Column subscript of element whose address is to be found
W = Storage size of one element stored in the array (in bytes)
Lr = Lower limit of row/start row index of matrix, if not given assume 0
Lc = Lower limit of column/start column index of matrix, if not given
assume 0 (zero)
M = Number of rows of the given matrix
N = Number of columns of the given matrix
Important : Usually the number of rows and columns of a matrix is given
( like A[20][30] or A[40][60] ), but it may instead be given as A[Lr … Ur, Lc … Uc],
where Ur and Uc are the upper limits of the row and column subscripts.
In this case the number of rows and columns is calculated as follows:
Number of rows (M) = (Ur – Lr) + 1
Number of columns (N) = (Uc – Lc) + 1
The rest of the process remains the same as per requirement (Row Major Wise
or Column Major Wise).
Example:
Q 1. An array X [-15 … 10, 15 … 40] requires one byte of storage per element. If
the beginning location is 1500, determine the location of X [15][20].
Solution:
The number of rows and columns is not given in the question, so they are calculated as:
Number of rows, M = (Ur – Lr) + 1 = [10 – (-15)] + 1 = 26
Number of columns, N = (Uc – Lc) + 1 = [40 – 15] + 1 = 26
(i) Column Major Wise Calculation
The given values are: B = 1500, W = 1 byte, I = 15, J = 20, Lr = -15, Lc = 15, M = 26
Address of A [ I ][ J ] = B + W * [ ( I – Lr ) + M * ( J – Lc ) ]
= 1500 + 1 * [(15 – (-15)) + 26 * (20 – 15)]
= 1500 + 1 * [30 + 26 * 5]
= 1500 + 1 * [160]
= 1660 [Ans]
(ii) Row Major Wise Calculation
The given values are: B = 1500, W = 1 byte, I = 15, J = 20, Lr = -15, Lc = 15, N = 26
Address of A [ I ][ J ] = B + W * [ N * ( I – Lr ) + ( J – Lc ) ]
= 1500 + 1 * [26 * (15 – (-15)) + (20 – 15)]
= 1500 + 1 * [26 * 30 + 5]
= 1500 + 1 * [780 + 5]
= 1500 + 785
= 2285 [Ans]
Note: the row subscript 15 lies outside the declared range -15 to 10; the formulas are applied to the values as given.
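Both two-dimensional formulas can be verified against the worked example with a short Python sketch. The function names address_row_major and address_col_major and their parameter names are assumptions made for illustration.

```python
def address_row_major(base, width, i, j, lr, lc, n_cols):
    """B + W * [N * (I - Lr) + (J - Lc)]"""
    return base + width * (n_cols * (i - lr) + (j - lc))


def address_col_major(base, width, i, j, lr, lc, n_rows):
    """B + W * [(I - Lr) + M * (J - Lc)]"""
    return base + width * ((i - lr) + n_rows * (j - lc))


# Bounds from the example: X[-15 ... 10, 15 ... 40], B = 1500, W = 1
lr, ur, lc, uc = -15, 10, 15, 40
m = (ur - lr) + 1  # number of rows
n = (uc - lc) + 1  # number of columns

row_addr = address_row_major(1500, 1, 15, 20, lr, lc, n)
col_addr = address_col_major(1500, 1, 15, 20, lr, lc, m)
print(m, n, row_addr, col_addr)  # 26 26 2285 1660
```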
Multidimensional Arrays: Two-dimensional
arrays, n-dimensional arrays.
Array of arrays
A two-dimensional array is an array that consists of more than
one row and more than one column.
It is helpful when you want to store data in matrix form or solve
mathematical matrix problems.
It is also used to implement other data structures such as Linked
Lists, Graphs, Queues, Stacks, Trees, etc.
In a 2-D array each element is referred to by two indexes.
The first index gives the row of the matrix and the second
index gives the column of the matrix.
Elements are stored in these arrays in the form of matrices.
Two-dimensional arrays
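In Python a two-dimensional array is commonly held as a list of row lists. The sketch below, with illustrative sizes and values, shows the two-index access described above.

```python
rows, cols = 3, 4

# Build each row separately so that rows do not share one list object.
matrix = [[r * cols + c for c in range(cols)] for r in range(rows)]

element = matrix[1][2]                 # first index = row, second index = column
column_2 = [row[2] for row in matrix]  # one column collected across all rows

for row in matrix:
    print(row)
print(element, column_2)
```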
Ordered List,
Single Variable Polynomial: Representation using
arrays, Polynomial as array of structure,
Polynomial addition.
Polynomial multiplication.
Ordered List
Array : An array is a data structure which stores elements of the
same type in contiguous memory locations.
Further, it has a fixed size which is defined upon
initialization.
Ordered list : An ordered list is an ordered collection of items.
Lists are dynamic arrays, which means that they have no fixed
size. A list can grow and shrink as per user inputs.
A list is a dynamic data structure that is capable of adding and
deleting elements during runtime.
Applications of Array
Polynomials and Sparse Matrix are two important
applications of arrays and linked lists.
A polynomial is composed of different terms
An essential characteristic of the polynomial is that each term
in the polynomial expression consists of two parts:
coefficient
exponent
Example:
10x^2 + 26x; here 10 and 26 are the coefficients and 2 and 1 are
the exponents.
Representation of Polynomial using Array
Array Representation of Polynomial
Polynomial : -10x^0 + 3x^1 + 5x^2
Coefficient   -10   3   5
Exponent        0   1   2
List Representation of Polynomial
Example : Addition of Polynomial using Array
X1   Coefficient   5   8   9   2
     Exponent      6   4   1   0
X2   Coefficient   1   2   4
     Exponent      5   3   0
Compare the exponents of the current terms of X1 and X2.
Addition of Polynomial using Array
Case I : If the exponent of the term pointed to by i in X1 is
greater than the exponent of the current term of X2 pointed to
by j, then copy the current term of X1 pointed to by i into the
location pointed to by k in poly X3, and advance i and k to the
next term.
if (x1[i].expo > x2[j].expo)
{
    x3[k].coeff = x1[i].coeff ;
    x3[k].expo = x1[i].expo ;
    i = i + 1 ;
    k = k + 1 ;
}
Addition of Polynomial using Array
Case II : If the exponent of the term pointed to by i in X1 is less
than the exponent of the current term of X2 pointed to by j, then
copy the current term of X2 pointed to by j into the location pointed
to by k in poly X3, and advance j and k to the next term.
Once every term of X1 has been processed, the remaining terms of X2
are copied to X3 (n is the number of terms in X2):
while (j < n) do
{
    x3[k].coeff = x2[j].coeff ;
    x3[k].expo = x2[j].expo ;
    j = j + 1 ;
    k = k + 1 ;
}
Example #1 : Addition of Polynomial using Array
X1   Coefficient   5   8   9   2
     Exponent      6   4   1   0
X2   Coefficient   1   2   4
     Exponent      5   3   0
i=3 j=2 (both point at the last terms, where the exponents are equal)
X3   Coefficient   5   1   8   2   9   6
     Exponent      6   5   4   3   1   0
X3 = 5x^6 + x^5 + 8x^4 + 2x^3 + 9x + 6
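The comparison cases can be put together in a short Python sketch. Each polynomial is assumed to be a list of (coefficient, exponent) pairs sorted by decreasing exponent, as in the tables. The equal-exponent branch (add the coefficients) is inferred from the worked example, where 2 and 4 combine into 6; dropping a term whose coefficients cancel to zero is an extra assumption of this sketch.

```python
def add_polynomials(x1, x2):
    """Add two polynomials given as (coeff, expo) lists in decreasing exponent order."""
    i = j = 0
    x3 = []
    while i < len(x1) and j < len(x2):
        c1, e1 = x1[i]
        c2, e2 = x2[j]
        if e1 > e2:            # Case I: copy the term of X1
            x3.append((c1, e1))
            i += 1
        elif e1 < e2:          # Case II: copy the term of X2
            x3.append((c2, e2))
            j += 1
        else:                  # equal exponents: add the coefficients
            if c1 + c2 != 0:
                x3.append((c1 + c2, e1))
            i += 1
            j += 1
    x3.extend(x1[i:])          # remaining terms of X1, if any
    x3.extend(x2[j:])          # remaining terms of X2, if any
    return x3


x1 = [(5, 6), (8, 4), (9, 1), (2, 0)]   # 5x^6 + 8x^4 + 9x + 2
x2 = [(1, 5), (2, 3), (4, 0)]           # x^5 + 2x^3 + 4
x3 = add_polynomials(x1, x2)
print(x3)
```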
Example #2 : Addition of Polynomial using Array
(7x^4 + 5x^2 + 3x) + (5x^3 + 3x - 8)
Coefficient   7   5   5   6   -8
Exponent      4   3   2   1    0
Multiplication of Polynomial using Array
X1 = 3x^1 + 5x^2 + 7x^3 = 0x^0 + 3x^1 + 5x^2 + 7x^3
X2 = 10x^0 + 3x^1 + 5x^2
        X1               X2
Exp     0   1   2   3    0    1   2
Coef    0   3   5   7    10   3   5
i=0 j=0
X3
Exp (k) 0   1    2    3     4    5
Coef    0   30   59   100   46   35
X3 = 0x^0 + 30x^1 + 59x^2 + 100x^3 + 46x^4 + 35x^5
Multiplication of Polynomial using Array
X1 = 5x^3 + 9x^2 + 3x + 2
X2 = 2x^3 + 4x + 5 = 2x^3 + 0x^2 + 4x + 5
        X1               X2
Exp     0   1   2   3    0   1   2   3
Coef    2   3   9   5    5   4   0   2
i=0 j=0
X3
Exp (k) 0    1    2    3    4    5    6
Coef    10   23   57   65   26   18   10
X3 = 10x^6 + 18x^5 + 26x^4 + 65x^3 + 57x^2 + 23x + 10
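The tables above store each polynomial as a coefficient array indexed by exponent. Under that layout, multiplication can be sketched in Python as follows: the term at index i of X1 times the term at index j of X2 contributes to index i + j of X3. The function name multiply_polynomials is illustrative.

```python
def multiply_polynomials(p, q):
    """Multiply polynomials stored as coefficient lists indexed by exponent."""
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b   # x^i * x^j = x^(i + j)
    return result


# First example: (3x + 5x^2 + 7x^3) * (10 + 3x + 5x^2)
first = multiply_polynomials([0, 3, 5, 7], [10, 3, 5])
# Second example: (5x^3 + 9x^2 + 3x + 2) * (2x^3 + 4x + 5)
second = multiply_polynomials([2, 3, 9, 5], [5, 4, 0, 2])
print(first)
print(second)
```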
Sparse Matrix:
What is Sparse matrix ?
Problems with Sparsity
Sparse matrix representation
Operations on sparse matrix-
Sparse matrix addition,
Transpose of sparse matrix-
Simple and Fast Transpose,
Time and Space tradeoff.
Sparse matrix
A sparse matrix is a matrix that contains more ZERO values than
NON-ZERO values.
When a sparse matrix is represented with a 2-dimensional array,
we waste a lot of space to represent that matrix.
For example, consider a matrix of size 10 X 10 containing only 10
non-zero elements.
In this matrix, only 10 positions hold non-zero values and the
remaining positions are filled with zero. In total we allocate
10 X 10 X 4 = 400 bytes of space to store this integer matrix
(taking 4 bytes per integer).
To access these 10 non-zero elements we have to scan all 100
positions. To make this simpler we use the following
sparse matrix representation.
Dense matrix vs. Sparse matrix
Problems with Sparsity
Space Complexity
Very large matrices require a lot of memory.
An example of a very large matrix that is too large to be stored
in memory is a link matrix that shows the links from one website
to another.
An example of a smaller sparse matrix might be a word or term
occurrence matrix for words in one book.
In both cases, the matrix contained is sparse with many more
zero values than data values. The problem with representing
these sparse matrices is that memory is required and must be
allocated for each zero value in the matrix.
This is clearly a waste of memory resources as those zero values
do not contain any information.
Time complexity
Assuming a very large sparse matrix can be fit into memory,
we will want to perform operations on this matrix.
Simply, if the matrix contains mostly zero-values, i.e. no
data, then performing operations across this matrix may take
a long time where the bulk of the computation performed
will involve adding or multiplying zero values together.
This is a problem of increased time complexity of matrix
operations that increases with the size of the matrix.
This problem is compounded when we consider that even
trivial machine learning methods may require many
operations on each row, column, or even across the entire
matrix, resulting in vastly longer execution times.
Sparse Matrix Representations
A sparse matrix can be represented by using TWO
representations,
Triplet Representation (Array Representation)
Linked Representation
Triplet Representation (Array Representation)
In this representation, we consider only non-zero values
along with their row and column index values.
In this representation, the 0th row stores the total number of
rows, total number of columns and the total number of non-
zero values in the sparse matrix.
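A minimal Python sketch of the triplet representation is given below. It uses the 3 X 4 matrix from the last fast-transpose example of these notes and 0-based indices; the function name to_triplets is an assumption for illustration.

```python
def to_triplets(matrix):
    """Convert a dense matrix into [row, column, value] triplets.

    Row 0 of the result holds [total rows, total columns, non-zero count].
    """
    rows = len(matrix)
    cols = len(matrix[0])
    triplets = [[rows, cols, 0]]
    for r in range(rows):
        for c in range(cols):
            if matrix[r][c] != 0:
                triplets.append([r, c, matrix[r][c]])
    triplets[0][2] = len(triplets) - 1  # fill in the non-zero count
    return triplets


dense = [
    [0, 5, 0, 6],
    [2, 0, 4, 0],
    [0, 0, 7, 0],
]
triplets = to_triplets(dense)
for t in triplets:
    print(t)
```

Here 12 stored cells shrink to 6 triplets of 3 numbers each, so the saving only appears when the matrix is much larger and sparser than this toy case.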
Array Representations of Sparse Matrix
For example, consider a matrix of size 5 X 6 containing 6
non-zero values. This matrix can be represented as
shown in the image...
Triplet Representations of Sparse Matrix
Linked Representations of Sparse Matrix
In linked list, each node has four fields. These four fields are
defined as:
Row: Index of row, where non-zero element is located
Column: Index of column, where non-zero element is located
Value: Value of the non zero element located at index – (row,
column)
Next node: Address of the next node
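The four fields listed above map directly onto a small node class. The sketch below assumes a singly linked list built in row-major order with 0-based indices; the names Node, build_linked and to_list are illustrative.

```python
class Node:
    """One non-zero element: row, column, value and a link to the next node."""

    def __init__(self, row, col, value):
        self.row = row
        self.col = col
        self.value = value
        self.next = None


def build_linked(matrix):
    """Link the non-zero elements of a dense matrix in row-major order."""
    head = tail = None
    for r, row in enumerate(matrix):
        for c, v in enumerate(row):
            if v != 0:
                node = Node(r, c, v)
                if head is None:
                    head = node
                else:
                    tail.next = node
                tail = node
    return head


def to_list(head):
    """Walk the list and collect (row, column, value) tuples."""
    items = []
    node = head
    while node is not None:
        items.append((node.row, node.col, node.value))
        node = node.next
    return items


head = build_linked([[0, 5, 0, 6], [2, 0, 4, 0], [0, 0, 7, 0]])
nodes = to_list(head)
print(nodes)
```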
Operations on Sparse Matrix
Sparse matrix addition
A
      1   2   3   4   5   6
1     0   0   0   6   0   0
2     0   7   0   0   0   0
3     0   2   0   5   0   0
4     0   0   0   0   0   0
5     4   0   0   0   0   0

B
      1   2   3   4   5   6
1     0   0   0   0   0   0
2     0   3   0   0   5   0
3     0   0   2   0   0   7
4     0   0   0   9   0   0
5     8   0   0   0   0   0

A + B
      1   2   3   4   5   6
1     0   0   0   6   0   0
2     0  10   0   0   5   0
3     0   2   2   5   0   7
4     0   0   0   9   0   0
5    12   0   0   0   0   0
Addition of Sparse Matrix
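In each table the first column holds the number of rows, the number of
columns and the number of non-zero values.

A
R   5   1   2   3   3   5
C   6   4   2   2   4   1
V   5   6   7   2   5   4

B
R   5   2   2   3   3   4   5
C   6   2   5   3   6   4   1
V   6   3   5   2   7   9   8

A + B
R   5   1   2   2   3   3   3   3   4   5
C   6   4   2   5   2   3   4   6   4   1
V   9   6  10   5   2   2   5   7   9  12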
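Addition on the triplet form can be sketched as a merge of two sorted lists, comparing (row, column) positions much as polynomial addition compares exponents. The sketch assumes both inputs are sorted in row-major order with a header triplet first, uses the 1-based indices of the example, and drops any sum that comes out as zero (an assumption of this sketch). The name add_sparse is illustrative.

```python
def add_sparse(a, b):
    """Add two sparse matrices given as triplet lists with a header triplet first."""
    if a[0][:2] != b[0][:2]:
        raise ValueError("matrices must have the same dimensions")
    i, j = 1, 1
    out = []
    while i < len(a) and j < len(b):
        ra, ca, va = a[i]
        rb, cb, vb = b[j]
        if (ra, ca) < (rb, cb):      # element only in A at this position
            out.append((ra, ca, va))
            i += 1
        elif (ra, ca) > (rb, cb):    # element only in B at this position
            out.append((rb, cb, vb))
            j += 1
        else:                        # same position: add the values
            if va + vb != 0:
                out.append((ra, ca, va + vb))
            i += 1
            j += 1
    out.extend(a[i:])
    out.extend(b[j:])
    return [(a[0][0], a[0][1], len(out))] + out


A = [(5, 6, 5), (1, 4, 6), (2, 2, 7), (3, 2, 2), (3, 4, 5), (5, 1, 4)]
B = [(5, 6, 6), (2, 2, 3), (2, 5, 5), (3, 3, 2), (3, 6, 7), (4, 4, 9), (5, 1, 8)]
C = add_sparse(A, B)
for t in C:
    print(t)
```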
Transpose of Matrix
Transpose of a matrix is obtained by interchanging rows and columns.
Element in the (i, j) position gets put in the (j, i) position.
Transpose of the matrix B1 is obtained as B2 by inserting (i,j)th
element of B1 as (j,i)th element in B2.
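The syllabus names a simple transpose, but these notes do not spell it out. The sketch below assumes the usual textbook form: for each column of the original matrix, scan every triplet and emit those that belong to that column with row and column swapped. Indices are 0-based and the name simple_transpose is illustrative.

```python
def simple_transpose(t):
    """Transpose a triplet list by scanning all triplets once per column."""
    rows, cols, count = t[0]
    result = [(cols, rows, count)]
    for c in range(cols):
        for r, col, v in t[1:]:
            if col == c:
                result.append((col, r, v))  # (i, j) becomes (j, i)
    return result


T = [(3, 4, 5), (0, 1, 5), (0, 3, 6), (1, 0, 2), (1, 2, 4), (2, 2, 7)]
T_simple = simple_transpose(T)
print(T_simple)
```

The nested loops cost about (number of columns) x (number of non-zero values) steps, which is the cost the fast transpose avoids.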
Fast Transpose of Sparse Matrix
As its name suggests, it is a faster way to transpose a sparse
matrix, though it is a little harder to understand.
We require 2 arrays, namely, count and position.
count: the number of non-zero elements in each column of the original matrix.
E.g. for a conventional matrix of size [3 x 4], the count array
has one entry per column (indices 0 to 3).
position: the starting position of each row in the transposed matrix,
i.e. where the 0th row starts, where the 1st row starts and so on.
Fast Transpose of Sparse Matrix
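T                          T'
Row  Column  Value         Row  Column  Value
5    8       6             8    5       6
0    0       10            0    0       10
0    4       -5            0    3       7
1    2       1             1    4       9
2    4       -4            2    1       1
3    0       7             4    0       -5
4    1       9             4    2       -4
Count of elements per column of T (columns 5 to 7 hold no elements)
Column   0   1   2   3   4
Count    2   1   1   0   2
Starting position of each row in T'
Row      0   1   2   3   4
Pos      0   2   3   4   4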
Fast Transpose of Sparse Matrix
T                          T'
Row  Column  Value         Row  Column  Value
6    6       8             6    6       8
0    0       15            0    0       15
0    3       22            0    4       91
0    5       -15           1    1       11
1    1       11            2    1       3
1    2       3             2    5       28
2    3       -6            3    0       22
4    0       91            3    2       -6
5    2       28            5    0       -15
Count of elements per column of T
Column   0   1   2   3   4   5
Count    2   1   2   2   0   1
Starting position of each row in T'
Row      0   1   2   3   4   5
Pos      0   2   3   5   7   7
Matrix
0   5   0   6
2   0   4   0
0   0   7   0
T                          T'
Row  Column  Value         Row  Column  Value
3    4       5             4    3       5
0    1       5             0    1       2
0    3       6             1    0       5
1    0       2             2    1       4
1    2       4             2    2       7
2    2       7             3    0       6
Count of elements per column of T
Column   0   1   2   3
Count    1   1   2   1
Starting position of each row in T'
Row      0   1   2   3
Pos      0   1   2   4
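The count and position arrays from the examples can be turned into a short Python sketch of the fast transpose. It takes a triplet list with a header triplet first and 0-based indices; the function name fast_transpose and the choice to return the two helper arrays alongside the result are assumptions made so the values can be compared with the tables.

```python
def fast_transpose(t):
    """Transpose a triplet list in one counting pass and one placement pass."""
    rows, cols, count = t[0]

    # Pass 1: count the non-zero elements in each column of T.
    col_count = [0] * cols
    for _row, c, _val in t[1:]:
        col_count[c] += 1

    # Starting position of each row of T' (each column of T).
    position = [0] * cols
    for c in range(1, cols):
        position[c] = position[c - 1] + col_count[c - 1]
    starts = position[:]  # keep a copy; position is advanced below

    # Pass 2: drop every triplet straight into its final slot.
    result = [None] * count
    for r, c, v in t[1:]:
        result[position[c]] = (c, r, v)
        position[c] += 1

    return [(cols, rows, count)] + result, col_count, starts


T = [(6, 6, 8), (0, 0, 15), (0, 3, 22), (0, 5, -15), (1, 1, 11),
     (1, 2, 3), (2, 3, -6), (4, 0, 91), (5, 2, 28)]
T_prime, col_count, starts = fast_transpose(T)
print(col_count)
print(starts)
for row in T_prime:
    print(row)
```

This does work proportional to the number of columns plus the number of non-zero values, at the price of two extra arrays: the time and space tradeoff listed in the syllabus.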