The Dummies Guide To Database Systems An Assembly of Information
[Cover figure: clients and servers connected through a communication network]
Rosina S Khan
Dedicated to:
You, the Valued Reader
http://www.rosinaskhan.weebly.com
Contents
Preface............................................................................................................................ vi
C H A P T E R 1 ................................................................................ 1
INTRODUCTION .......................................................................................................... 1
1.1 View of Data ........................................................................................................ 3
1.2 Instances and Schemas.......................................................................................... 6
1.3 Database Languages.............................................................................................. 6
1.4 Database Users ...................................................................................................... 7
1.5 Database Administrator......................................................................................... 8
C H A P T E R 2 ................................................................................. 9
ENTITY RELATIONSHIP MODEL ............................................................................. 9
2.1 Entity, Entity set and Attributes ............................................................................ 9
2.2 Types of attributes................................................................................................. 9
2.3 Relationship Set .................................................................................................. 11
2.4 Mapping Cardinalities......................................................................................... 11
2.5 Super key, Candidate key and Primary key ........................................................ 12
2.6 Entity-Relationship Diagram (ERD)................................................................... 13
2.7 ER-diagrams with different cardinality ratios..................................................... 14
2.8 Definition of Participation .................................................................................. 19
2.9 ER-diagram Problems ......................................................................................... 20
C H A P T E R 3 .............................................................................. 22
RELATIONAL MODEL .............................................................................................. 22
3.1 Converting ER diagrams to relational models .................................................... 22
3.2 Relational model Problems ................................................................................. 29
3.3 Query Languages ................................................................................................ 30
3.4 Relational Algebra .............................................................................................. 31
3.5 Outer Join ............................................................................................................ 37
3.6 The Division Operation....................................................................................... 39
3.7 Modification of the Database .............................................................................. 40
C H A P T E R 4 ............................................................................... 42
THE QUERY LANGUAGE SQL ................................................................................ 42
4.1 Introduction to SQL ........................................................................................... 42
4.2 SQL Expressions using functions ...................................................................... 44
4.3 SQL Expressions using Grouping....................................................................... 45
4.4 SQL Expressions using Sorting .......................................................................... 46
4.5 Searching for partial strings ................................................................................ 46
4.6 SQL Expressions concerning Input of Data........................................................ 47
4.7 SQL Expressions concerning deletion of tuples ................................................. 47
4.8 Update of tuples/values ....................................................................................... 47
4.9 Deletion of tables ................................................................................................ 47
4.10 Schema updates................................................................................................ 48
4.11 Definition of a Domain .................................................................................... 48
4.12 Creation and Deletion of Views........................................................................ 49
4.13 The Rename Operation ..................................................................................... 49
4.14 UNION ALL ..................................................................................................... 50
4.15 INTERSECT & INTERSECT ALL............................................................... 50
4.16 Inner and Outer Joins ....................................................................................... 50
C H A P T E R 5 .............................................................................. 54
INTEGRITY CONSTRAINTS IN RELATIONAL SYSTEMS .................................. 54
5.1 Required Data ..................................................................................................... 54
5.2 Domain Constraints ............................................................................................ 54
5.3 Entity Integrity .................................................................................................... 55
5.4 Referential Integrity ............................................................................................ 56
5.5 Modification........................................................................................................ 59
C H A P T E R 6 .............................................................................. 60
FUNDAMENTAL DEPENDENCIES AND NORMALIZATION ............................. 60
6.1 Definition of functional dependency................................................................... 60
6.2 Definition of full functional dependencies ......................................................... 60
6.3 Normalization...................................................................................................... 61
6.4 Notes on Normalization ...................................................................................... 67
6.5 Quality Criteria for Relational Design ................................................................ 68
C H A P T E R 7 ................................................................................ 69
QUERY PROCESSING ............................................................................................... 69
7.1 Introduction to Query Processing ....................................................................... 69
C H A P T E R 8 ............................................................................... 72
FILE ORGANIZATION ............................................................................................... 72
8.1 Fixed-Length Records......................................................................... 72
8.2 Variable-Length records ..................................................................................... 73
8.3 Byte-String Representation ................................................................................. 74
8.4 Slotted-page structure ......................................................................................... 74
8.5 Types of Record Organizations........................................................................... 76
8.6 Sequential File Organization.............................................................................. 76
8.7 Multitable Clustering File Organization ............................................................. 78
C H A P T E R 9 ............................................................................... 80
DATA-DICTIONARY STORAGE .............................................................................. 80
C H A P T E R 10 ............................................................................. 81
INDEXING ................................................................................................................... 81
10.1 Basic Concepts .................................................................................................. 81
10.2 Ordered Indices ................................................................................................. 82
10.3 B+-Tree Index Files .......................................................................................... 88
C H A P T E R 11 ............................................................................. 91
TRANSACTIONS ........................................................................................................ 91
11.1 Syntax for Transactions ................................................................................... 91
C H A P T E R 12 ............................................................................. 93
RECOVERY IN TRANSACTIONS ............................................................................ 93
12.1 Problem Sources ............................................................................................... 93
12.2 Logging and Recovery from Software Failures ................................................ 93
C H A P T E R 13 ............................................................................. 96
CONCURRENCY CONTROL .................................................................................... 96
13.1 Problems Caused by Violation of Isolation ...................................................... 96
13.2 Serializability .................................................................................................... 97
13.3 Some Synchronization Protocols ...................................................................... 99
C H A P T E R 14 ........................................................................... 102
ADVANCED DATABASES.......................................................................................... 102
14.1 Distributed Databases ..................................................................................... 102
14.2 Data Warehouses............................................................................................. 106
14.3 Multimedia Databases..................................................................................... 107
14.4 Data Mining .................................................................................................... 108
14.5 What is a NoSQL (Not Only SQL) Database? ............................................... 108
MISCELLANEOUS DATABASE PROJECT ......................... 112
Preface
Nowadays there is a growing need to organize, store and retrieve an organization's
information quickly and conveniently. Paper files need manual work for updates and for
throwing away clutter while retaining the important documents. All of this can be done
smoothly and speedily by computerizing the information via databases. That is what this
book is about: it covers a fundamental approach to the topic, suitable for undergraduate
courses in databases or for any newcomer who wants to grasp the concepts behind
databases.
[2] Professor Dorothee Koch's lecture notes, Stuttgart University of Applied Sciences,
Germany, 2005
[3] C. J. Date, An Introduction to Database Systems, 8th Edition, Addison-Wesley, 2003
Organization
Chapter 1 is an introductory chapter. It introduces the back end and front end of
database systems and covers material from [1] about why databases are useful at all.
Chapter 2 contains the basics of the Entity Relationship Model (ERM), the fundamental
step in designing databases at the back end; the resources are mainly from [1] and [2].
Chapter 3 covers how to convert ERMs to relational models (table schemas) [2], and
introduces relational algebra, a pure and procedural form of query language [1].
Chapter 4 introduces the Structured Query Language (SQL), the most widely used
language for querying relational databases and retrieving information [mainly [2] and
partly [1]].
Chapter 6 covers functional dependencies of one attribute (field) in a table on other
attribute(s), and, based on the concept of normalization, whether tables must be split
because they violate normal forms. [2]
Chapter 7 mainly covers query processing which is the series of activities involved in
extracting data from a database. [1]
Chapter 8 explains why the need to map databases to files may arise and how. [1]
Chapter 9 is a short explanation of a data dictionary and what info about databases it
contains. [1]
Chapter 10 includes sophisticated indexing techniques for files. Just as words or phrases
in a text book index appear in a sorted order, an index for a file in a database works in a
similar way. [1]
Chapter 11 introduces the concept of transactions which are a sequence of data access
operations that transfers the database from one consistent state to another consistent state.
[2]
Chapter 13 covers concurrency control which rectifies the problems occurring if two or
more transactions using the same data items are executed in parallel. [2]
There is also a miscellaneous database project at the very end which students can work
on throughout the whole semester in parallel with the theory lectures. The project is
equally useful for readers who are not students and can work on it in their own interest.
Acknowledgments
Last but not least, I am thankful to my mom and Dr Manzur for all their help and
support while writing this book.
--Rosina S Khan
C H A P T E R 1
INTRODUCTION
Why interrelated data? Because some sort of relationship exists among the data. (It will
become clearer when we come to the topic of normalization of tables.) This actually occurs
in the backend. A backend is nothing but a collection of interrelated data in different
tables. A table has some fields in a row; these fields are also called attributes.
Corresponding to the attributes are data in rows, called records or tuples. The entire
table corresponds to an entity. We will cover more on entities and attributes in the next
chapter.
Why set of programs? Programs here refer to application programs at the front end or
interfaces connected to data at the backend. A typical layout for backend and front end is
given as an example below:
[Table 1.2: Jones Account Info, with Accnt_A: 100 and Accnt_B: 250]
[Fig 1.1: transfer interface with textboxes FROM: Accnt_A, TO: Accnt_B, Transfer Amount: 50, and a TRANSFER button]
A backend database, for example, may consist of a Jones Account Info table. After a
transaction, Jones Account Info may be updated as shown in Table 1.2. An interface to
the backend database may be depicted as shown in Fig 1.1. Given the values of the source
and destination accounts as well as the transfer amount in textboxes, hitting the transfer
button will carry out the transaction, which will in turn be reflected in the Jones
Account Info table in the backend.
The backend may be developed using software tools such as MS SQL Server, MySQL
etc. while the front end may be developed using C#, or PHP, JavaScript and HTML, etc.
Before the advent of databases, organizations stored information using a typical file-
processing system. In this system, permanent records were stored in various files and
different application programs were written to extract records and add records to the
appropriate files.
Data redundancy and inconsistency: Different programmers may write the files
and application programs over a period of time. As a result, files may have
different structures and the programs may be written in several different
programming languages. Also, the same information may appear in different files.
Data inconsistency results when the same information is updated in one place but
not in another, in addition to higher storage and access cost.
Data Isolation: Because data lie in different files, and files may be in different
formats, it is difficult to retrieve the appropriate data; a new application program
must be written each time.
Integrity Problems: The data values stored in the database must satisfy certain
conditions called integrity or consistency constraints. For example, the bank
balance of a customer must never fall below Tk200. Developers impose these
constraints by writing appropriate code in the various application programs. To
enforce a new constraint such as the bank balance of customers should not exceed
10 crore taka, it becomes difficult for the developer to enforce the new constraint
and change the programs. The problem worsens when constraints involve several
data items from different files.
Security problems: Not every user of the database system should be able to access
all the data. For example, in a banking system, payroll personnel (e.g. a tax
officer) need to see only that part of the database that has information about the
various bank employees. They do not need to access information about customer
accounts. As another example, bank tellers see only that part of the database that
has information on customer accounts; they cannot access information about the
salaries of bank employees. Enforcing such security constraints on a file-processing
system is difficult because application programs are added to the system in an
ad hoc manner.
A major purpose of a database system is to provide users with an abstract view of the
data. That is, the system hides certain details of how the data are stored and maintained.
1.1.1 Data Abstraction
Since many database-system users are not computer-savvy, developers hide certain
complexity through several levels of abstraction in order to make users' interaction
with the system easier.
Physical level: The lowest level of abstraction describes how data are actually
stored. It describes complex low-level data structures in detail.
Logical level: The next-higher level of abstraction describes what data are stored
in the database, and what relationships exist among those data. It describes the
entire database in terms of a small number of relatively simple structures.
Database administrators, who must decide what information to store in the
database, use the logical level of abstraction.
View Level: The highest level of abstraction describes only part of the entire
database. In this level users need to access only part of the entire database. The
level simplifies the interaction of users with the system. The system may provide
many views for the entire database.
[Figure: the three levels of data abstraction, with the view level at the top, the logical level below it, and the physical level at the bottom]
Distinction among levels of abstraction may be compared to the concept of data types in
programming languages. Most high-level programming languages support the notion of a
structured type. For example, in a Pascal-like language, we may declare a record as
follows:
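The Pascal-like declaration of Code 1.1 did not survive in this copy of the text. As a rough modern analogue (not the book's original code), the same kind of record type can be sketched as a Python dataclass; the four field names are assumptions based on the attributes used elsewhere in this chapter.

```python
from dataclasses import dataclass

# Sketch of a customer record type with four fields, analogous to the
# Pascal-like Code 1.1; the field names are assumed from the attributes
# used elsewhere in this chapter, and the sample values are hypothetical.
@dataclass
class Customer:
    customer_id: str
    customer_name: str
    customer_street: str
    customer_city: str

jones = Customer("C-101", "Jones", "Main Street", "Harrison")
print(jones.customer_name)  # prints: Jones
```

Each field has a name and a type associated with it, just as in the record type the text goes on to describe.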
Code 1.1 defines a record type called customer with four fields. Each field has a name
and a type associated with it. A banking enterprise may have several such record types,
including
account, with fields account_number and balance
employee, with fields employee_name and salary
At the physical level, a customer, account, or employee record can be described as a block
of consecutive storage locations or bytes. The compiler hides this level of detail from
programmers. Similarly, the database system hides many of the lowest-level storage
details from database programmers. The database administrator, however, may be aware of
certain details of the physical organization of the data.
At the logical level, each such record is described by a type definition as shown in Code
1.1, and the level also defines the interrelationships among the different record types.
Programmers using a programming language work at this level of abstraction. Similarly,
database administrators usually work at this level of abstraction too.
Finally at the view level, computer users see a set of application programs or front end
interfaces that hide details of the data types. At this level several views are defined and
database users see and access these views. The views also provide a kind of security
mechanism to allow users only to access certain parts of the database. This has been
explained in detail in the subsection Security problems on page 3. Some examples of
views can be:
View1: customer_name|account_number|balance
View2: employee_name|account_number|balance|salary
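As an illustrative sketch (not from the book), a view can be pictured as a projection that exposes only some fields of the underlying data; the field names below are taken from View1 and View2 above, while the sample values are hypothetical.

```python
# Sketch: a view as a projection of a full record onto the fields that
# the view is allowed to expose (field names from View1/View2 above;
# the sample values are hypothetical).
full_record = {
    "customer_name": "Jones",
    "employee_name": "Smith",
    "account_number": "A-217",
    "balance": 700,
    "salary": 50000,
}

def project(record, fields):
    """Keep only the fields a given view may see."""
    return {f: record[f] for f in fields}

view1 = project(full_record, ["customer_name", "account_number", "balance"])
view2 = project(full_record, ["employee_name", "account_number", "balance", "salary"])
print("salary" in view1)  # prints: False -- View1 hides employee salaries
```

This is exactly the security mechanism mentioned above: a user granted only View1 can never see the salary field.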
1.2 Instances and Schemas
Databases change over time as information is inserted and deleted. The collection of
information stored in the database at a particular moment is called an instance of the
database. The overall design of the database is called the database schema.
Employee Table:
[Table: an Employee table whose column headings form the schema and whose data rows form an instance]
In the Employee Table above, the schema and an instance are clearly shown.
The logical schema is the most important among all the schemas, since programmers
construct application programs and front-end interfaces by using logical schemas. The
physical schema is hidden beneath the logical schema and can usually be changed easily
without affecting the logical level. Hence, application programs do not need to be
rewritten if the physical schema changes; they are said to exhibit physical data
independence.
1.3 Database Languages
a) Data-Manipulation Language
b) Data-Definition Language
1.4 Database Users
Naive users are unsophisticated users who interact with the system
by invoking one of the application forms that have been written previously. An
example of an application program is the Jones Account Interface in
Fig. 1.1, where the naive users fill in the fields of the form and hit the button. They
may also simply read reports generated from the database.
1.5 Database Administrator
A person who has central control over the whole database system is the database
administrator (DBA). The DBA's functions may be summarized as follows:
C H A P T E R 2
ENTITY RELATIONSHIP MODEL
2.1 Entity, Entity set and Attributes
The properties or parts of an entity are called attributes. For example, a person has the
attributes person_id, name, occupation, salary etc. A book has the attributes book_id,
author, publisher, category, number of copies etc. Attributes are actually the fields of a
database table or entity.
An entity set is a set of entities of the same type that share the same properties or
attributes. For example, the set of all students who take a class can represent a student
entity set in which each entity is a student sharing similar attributes with other students
such as student_id, name, contact_no, address etc.
Upper and lower bounds may be placed on the number of values in a multivalued
attribute as needed. For instance, a bank may limit the number of phone_numbers
stored for a customer to two. Placing bounds in this way means that the
phone_number attribute of the customer entity set may hold between zero and two
values.
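Such a bound can be pictured as a simple validation step before data is stored; the following is a hypothetical sketch (the function and variable names are not from the book).

```python
# Sketch: enforcing an upper bound of two phone numbers per customer,
# as in the bank example above (names here are hypothetical).
MAX_PHONE_NUMBERS = 2

def set_phone_numbers(customer, numbers):
    # Reject updates that would exceed the declared upper bound.
    if len(numbers) > MAX_PHONE_NUMBERS:
        raise ValueError(f"at most {MAX_PHONE_NUMBERS} phone numbers allowed")
    customer["phone_numbers"] = list(numbers)

c = {"customer_id": "C-101"}
set_phone_numbers(c, ["555-0101"])          # within the bound: accepted
try:
    set_phone_numbers(c, ["1", "2", "3"])   # three values: rejected
except ValueError as e:
    print("rejected:", e)
```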
Null attributes: An attribute takes a null value when that value is missing, not
applicable or unknown. For example, a person may have no middle name (not
applicable). If the name of a particular customer is null, we assume that the value
is missing, since every customer must have a name. A null value for an
apartment_number could mean that the address does not include an apartment number
(not applicable), that an apartment number exists but we do not know what it is
(missing), or that we don't know whether an apartment number is part of the
customer's address (unknown).
Derived attributes: The value of this type of attribute can be derived from the
values of other related attributes or entities. For example, a customer entity has
the attribute customer_age. If the customer entity also has an attribute
date_of_birth, we can calculate the age from date_of_birth and the current date.
Therefore, customer_age is a derived attribute. As another example, the entity
employee can have employment_length as an attribute. If this entity has another
attribute start_date for the employment, then we can calculate the employee's
employment_length from start_date and the current date. Hence, in this case
employment_length is a derived attribute.
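A derived attribute such as customer_age is computed on demand rather than stored. A minimal sketch, assuming date_of_birth is stored as a calendar date:

```python
from datetime import date

# Sketch: customer_age as a derived attribute, computed from the stored
# date_of_birth and the current date instead of being stored itself.
def customer_age(date_of_birth, today=None):
    today = today or date.today()
    # Subtract one year if this year's birthday has not happened yet.
    had_birthday = (today.month, today.day) >= (date_of_birth.month, date_of_birth.day)
    return today.year - date_of_birth.year - (0 if had_birthday else 1)

print(customer_age(date(1990, 6, 15), today=date(2024, 6, 14)))  # prints: 33
print(customer_age(date(1990, 6, 15), today=date(2024, 6, 15)))  # prints: 34
```

The employment_length example works the same way, with start_date in place of date_of_birth.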
[Figure: examples of composite attributes, name and customer_street]
2.3 Relationship Set
A relationship set is a set of relationships of the same type. Consider two entity sets
customer and loan. We define the relationship set borrows to denote the association
between customers and bank loans that the customers have.
2.4 Mapping Cardinalities
Mapping cardinalities are useful in describing binary relationship sets, although they can
also contribute to the description of relationship sets that involve more than two entity
sets. In this section we shall concentrate on binary relationship sets only.
For a binary relationship set R between entities A and B, mapping cardinalities must be
one of the following:
One-to-one: An entity in A is associated with at most one entity in B, and an
entity in B is associated with at most one entity in A.
One-to-many: An entity in A is associated with any number of entities in B, but
an entity in B is associated with at most one entity in A.
Many-to-one: An entity in A is associated with at most one entity in B, but an
entity in B is associated with any number of entities in A.
Many-to-many: An entity in A is associated with any number of entities in B,
and an entity in B is associated with any number of entities in A.
[Figures: sample mappings between a-entities and b-entities illustrating these cardinalities]
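As an illustrative aside (not from the book), a binary relationship set given as a set of (a, b) pairs can be classified mechanically into one of the four mapping cardinalities:

```python
from collections import defaultdict

# Sketch: classify a binary relationship set, given as (a, b) pairs,
# into one of the four mapping cardinalities.
def mapping_cardinality(pairs):
    b_per_a, a_per_b = defaultdict(set), defaultdict(set)
    for a, b in pairs:
        b_per_a[a].add(b)
        a_per_b[b].add(a)
    many_b = any(len(s) > 1 for s in b_per_a.values())  # some a relates to many b
    many_a = any(len(s) > 1 for s in a_per_b.values())  # some b relates to many a
    if many_b and many_a:
        return "many-to-many"
    if many_b:
        return "one-to-many"
    if many_a:
        return "many-to-one"
    return "one-to-one"

print(mapping_cardinality([("a1", "b1"), ("a1", "b2"), ("a2", "b3")]))  # prints: one-to-many
```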
2.5 Super key, Candidate key and Primary key
A superkey is a set of one or more attributes that, taken collectively, allows us to identify
uniquely an entity in the entity set. For example, the customer_id attribute of the entity
set customer is sufficient to distinguish one customer entity from another; thus,
customer_id is a superkey. Similarly, the combination of customer_id and
customer_name is a superkey for the entity set customer: a superkey augmented with
extra attributes is still a superkey. However, the customer_name attribute alone is not a
superkey, because several people might have the same name.
Thus, we now know that if K is a superkey, then so is any superset of K. But we are often
interested in minimal superkeys with no extraneous attributes. Such minimal superkeys
are called candidate keys.
Several distinct sets of attributes could serve as a candidate key so long as they identify
uniquely an entity in the entity set. Consider the combination of customer_name and
customer_street. This set of attributes can distinguish uniquely among members of the
customer entity set. Therefore, both {customer_id} and {customer_name,
customer_street} are candidate keys; being candidate keys, both are of course superkeys
as well. Additionally, although the set {customer_id, customer_name} can distinguish
customer entities, their combination does not form a candidate key, since the attribute
customer_id alone is a candidate key. This set is, on the other hand, a superkey.
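The superkey test above (no two entities agree on all the key attributes) can be sketched as a small check over a table; the sample rows below are hypothetical.

```python
# Sketch: a set of attributes is a superkey if no two rows of the table
# agree on all of those attributes (sample rows are hypothetical).
def is_superkey(rows, attributes):
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attributes)
        if key in seen:        # two entities share these attribute values
            return False
        seen.add(key)
    return True

customers = [
    {"customer_id": 1, "customer_name": "Jones", "customer_street": "Main St"},
    {"customer_id": 2, "customer_name": "Jones", "customer_street": "North St"},
    {"customer_id": 3, "customer_name": "Hayes", "customer_street": "Main St"},
]
print(is_superkey(customers, ["customer_id"]))       # prints: True
print(is_superkey(customers, ["customer_name"]))     # prints: False (two Joneses)
```

Note that the check only confirms uniqueness over the rows given; whether a set of attributes is a key is really a design decision about all possible instances.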
We shall use the term primary key to denote the candidate key that is chosen by the
database designer as the principal means of identifying entities within an entity set.
Unlike superkeys and candidate keys, there is only one primary key for an entity set. It
is noteworthy that every entity set must have a primary key to distinguish uniquely the
entities within the entity set.
2.6 Entity-Relationship Diagram (ERD)
[Fig 2.4: ER diagram. Entity customer (attributes customer_id (underlined), customer_name, customer_street, customer_city) is connected through the borrows relationship, with cardinality 1 : m from customer to loan, to entity loan (attributes loan_number, amount)]
Fig 2.4 shows an example of an entity-relationship diagram. Such a diagram uses the
following conventions:
Entity sets represented by rectangles (e.g. the entities
customer and loan)
Attributes of an entity represented by ellipses (e.g.
loan_number and amount attributes for the entity loan)
Relationship sets represented by diamonds (e.g. borrows
relationship set.)
Primary key attribute of an entity set underlined (e.g. primary
key customer_id underlined for entity customer)
Links joining diamond relationship set to entities. (e.g. there
are links joining the entities to the borrows relationship to complete the E-R
diagram.)
Cardinality ratio, as explained in Sec. 2.4. (e.g. fig 2.4
depicts a binary relationship between the two entities customer and loan. The
cardinality ratio from customer to loan is one-to-many because a customer can
borrow many loans from the bank, but one loan from the bank can belong to only
one customer, assuming the loan is not joint, i.e., it is an individual loan.)
2.7 ER-diagrams with different cardinality ratios
a) 1-1 relationships (one-to-one relationships)
[Fig: ER diagram. Entity Department (attributes name, budget) is connected 1 : 1 through the manages relationship to entity Manager (attributes givenName, famName)]
The given ER diagram has cardinality ratio 1:1 because 1 department has 1 manager and
1 manager manages 1 department.
b) 1-n relationships (one-to-many relationships)
i)
[Fig: ER diagram. Entity Place (attributes name, numberOfInhabitants) is connected 1 : n through the born in relationship to entity Person (attributes givenName, famName, dateOfBirth)]
The above ER diagram has cardinality ratio 1: n because in 1 place many persons are
born and 1 person is born in 1 place.
ii)
[Fig: ER diagram. Entity DeviceType (attributes typename, manufacturer) is connected 1 : n through the belongs to relationship to entity Device (attributes inventoryNr, dateOfPurchase, repairs)]
In the above ER diagram, DeviceType may be categories of tools while a device belongs
to one such category.
For example:
DeviceType | Device
Electric tools | drill machine, electric knife, iron
Manual tools | scissors, pliers, screwdriver
Considering the above ER diagram and the given example, we can say the ERD has
cardinality ratio 1: n because one DeviceType contains many devices but one device can
belong to one DeviceType only.
c) n-m relationships (many-to-many relationships)
i)
[Fig: ER diagram. Entity Project (attributes nr, title, budget) is connected n : m through the works on relationship (attribute workTime) to entity Employee]
The above ER diagram has cardinality ratio n : m because in 1 project many employees
can work and 1 employee can work in many projects.
Note: In this diagram, we see that the relationship works on has an attribute workTime.
In the case of n-m relationships, the diamond relationship can have attributes if such
attributes cannot be directly assigned to the entities. For example, if the entity Employee
is assigned workTime, we cannot find out how much time the employee works on which
project. On the other hand, if we assign workTime to the Project entity, we cannot find
out which employee works for how much time on a particular project. It is further to be
noted that in the case of binary relationships 1:1 and 1:n, the middle diamond
relationship cannot have any attributes.
ii)
[Fig: ER diagram. Entity Student (attributes matNr, givenName, famName) is connected n : m through the takes relationship (attribute grade) to entity Class (attributes title, subject, professor)]
The above ER diagram has cardinality ratio n : m because one student can take many
classes and in 1 class, there can be many students.
Note: For reasons similar to the previous example, the attribute grade is assigned to the
takes relationship. Grade cannot be assigned to Class because then we lose the
information on which student gets a grade for a particular class, and if we assign grade
to Student, we lose the information on which class a particular student gets a grade for.
d) n-m-k relationships (ternary relationships)
[Fig: ER diagram. A ternary usage relationship (attribute amount) connects Part (attributes nr, name) with cardinality n, Project (attributes nr, title) with cardinality m, and Supplier (attributes name, address) with cardinality k]
In the above ER diagram, the cardinality ratio is n:m:k. One supplier can supply many
parts for one project. The same part can be supplied by one supplier for many projects.
The same part can be supplied for one project by many suppliers.
Note: In all ternary relationships, the middle diamond relationship can have attribute(s)
when those attribute(s) cannot be directly assigned to the entities, as has been explained
in the case of n-m binary relationships.
e) 1-1-1 ternary relationship (one-to-one-to-one ternary relationship)
[Fig: ER diagram. A ternary admin relationship connects Department (attributes name, budget), Manager (attributes name, salary) and Secretary (attributes phone, fax), each with cardinality 1]
The above ER diagram has cardinality ratio 1:1:1. Let us assume 1 department has 1
manager and 1 secretary. Now 1 department with 1 manager works with 1 secretary. 1
department with 1 secretary works with 1 manager. 1 manager together with 1 secretary
works in 1 department.
f) Recursive relationships
i)
[Fig: recursive ER diagram. Entity Assembly (attributes nr, name) is related to itself through the construction relationship with cardinality n : m]
The above ER diagram comprises products consisting of a hierarchy of assemblies.
One assembly contains other assemblies, and one assembly may be contained in other
assemblies.
ii)
[Fig: recursive ER diagram. Entity Person (attributes givenName, famName) is related to itself through the parentage relationship with cardinality n : m]
In the above recursive ER diagram, people are parents and children of other people.
If every instance of an entity is related to one or more instances of another entity via a relationship, this is called total participation. If not every instance of the entity is related via the relationship, this is called partial participation.
Example:
[ER diagram: Project (budget) related to Employee (address) via works on, with cardinality n : m]
Not every employee manages a project, so Employee participates only partially in the
relationship manages, while Project may participate totally in manages (if projects
can only be defined if they have a manager).
i) The Banking Enterprise Problem
Customers can deposit money in an account, either a checking or a savings account. They can borrow loans from the bank and pay them back in installments. More than one account can be set up in a particular branch. Loans can be borrowed from the same branch. Employees in the bank, including a manager, serve the customers for their specific interests. Develop an ER diagram for the above application.
Solution:
[ER diagram (solution): entities Customer (customer_id, customer_name, customer_street, customer_city, contact_no), Account (account_number, balance, type), Loan (amount, date), Payment (pay_no, amount, date), Employee (employee_id, ename, designation, start_date) and Branch (branch_name); relationships Deposit (date) between Customer and Account (n : m), Borrow between Customer and Loan (n : m), Has between Loan and Payment, Serves between Customer and Employee, and Account_branch / Loan_branch linking Account and Loan to Branch (1 : n)]
ii) University Organization Problem
Consider the following university database application. Every student has a matriculation number (student id) and a name. A student takes a class in a room on a particular day. A teaching assistant (TA), who is also a student of the university, can take class(es) for a number of hours not exceeding 80 hrs and is given a salary per month not exceeding Tk 2000. In addition to TAs, there are professors, who can give lectures for more than one class, but one class's lecture is given by one professor; professors have salaries not exceeding Tk 3 lacs. At the end of the semester, a student gets a grade for each class number (course).
Develop an ER diagram for the above application.
Solution:
[ER diagram (solution): Student (matNr, sName) related to Class (classNr, room, day) via Takes (grade) with cardinality m : n; Professor (pname, psalary) related to Class via Gives with cardinality 1 : n; TA (hours, grade) modeled as a student's association with Class]
C H A P T E R 3
RELATIONAL MODEL
3.1.1 One-one relationship
Model each entity as a separate relation. Add the primary key of one relation as a foreign key (a copy of the primary key from the first relation, which may or may not be a part of the primary key) to the other relation.
The above ER diagram can be modeled as a relational model or table schemas as follows:
Department:
Employee:
famName | givenName | address | salary
pk
3.1.2 One-many relationship
Add the primary key of the relation on the 1 side as a foreign key to the relation on the n side.
[ER diagram: Department (dName, address) related to Employee (famName, givenName, salary) via worksIn, with cardinality 1 : n]
Employee:
famName | givenName | salary | dName (fk)
Department:
dName | address (pk: dName)
3.1.3 Many-many relationship
Note: The primary key of the relation modeled from the relationship needs to be determined after careful inspection.
Example:
[ER diagram: Patient (pNr, pName, address) related to Doctor (dNr, dName) via operation (type, date, medication, sideEffect), with cardinality n : m]
Patient:
pNr | pName | address
Doctor:
dNr | dName
operation:
pNr | dNr | date | type | medication | sideEffect
In the operation relationship, date needs to be part of the primary key. The same patient can be operated on by the same doctor on two different dates, for example. In that case the values of pNr and dNr are the same in two different tuples. Since the value of a primary key has to be different in every tuple, we add date as part of the primary key for the relation operation in order to make the key unique.
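As a concrete sketch of this rule, the following SQLite snippet (with assumed column names; the date attribute is spelled opdate here purely for readability) declares (pNr, dNr, opdate) as the composite primary key, so the same patient/doctor pair is accepted again on a different date:

```python
import sqlite3

# Composite primary key (pNr, dNr, opdate): the same patient and doctor may
# appear in two tuples as long as the dates differ.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE operation (
    pNr INTEGER, dNr INTEGER, opdate TEXT, type TEXT,
    PRIMARY KEY (pNr, dNr, opdate))""")
conn.execute("INSERT INTO operation VALUES (1, 7, '2020-01-05', 'appendectomy')")
# Same patient and doctor on another date: accepted without a key violation.
conn.execute("INSERT INTO operation VALUES (1, 7, '2020-02-10', 'follow-up')")
print(conn.execute("SELECT COUNT(*) FROM operation").fetchone()[0])  # 2
```

With only (pNr, dNr) as the key, the second INSERT would be rejected.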
3.1.4 Many-many-many ternary relationship
[ER diagram: ternary relationship admin (date) connecting Patient with cardinality m, MonthlyReport with cardinality n, and Doctor (dNr, dName, salary)]
Patient:
pNr | pName
Doctor:
MonthlyReport:
admin:
3.1.5 Arbitrary examples of other ternary relationships
i) Many-many-one relationship
[ER diagram: ternary relationship D connecting A (a) with cardinality n, B (b) with cardinality m, and C (c)]
A: a    B: b    C: c
D: a | b | c (fk)
ii) Many-one-one relationship
[ER diagram: ternary relationship D connecting A (a) with cardinality n, B (b) with cardinality 1, and C (c)]
Converting the above ER diagram to a relational model:
A: a    B: b    C: c
D: a | b | c (fk fk)
iii)One-one-one relationship
a)
[ER diagram: ternary relationship D connecting A (a), B (b) and C (c), each with cardinality 1]
A: a    B: b    C: c
D: a | b | c
b) Alternative relational model for one-one-one relationship
[ER diagram: as above, but the relationship D has its own key attribute did]
A: a    B: b    C: c
D: did | a | b | c (a, b and c are fk)
Alternatively, if the relationship D already has a primary key did, it can be modeled as a relation by taking the primary keys a, b and c from the entities A, B and C respectively as foreign keys.
Note: As with many-many binary relationships, with ternary relationships the primary key of the relation modeled from the relationship needs to be determined after careful inspection.
3.2 Relational Model Problems
3.2.1 The Banking Enterprise Problem
Referring to the ER model for the Banking Enterprise Problem in section 2.9(i), it can be converted to a relational model as follows:
Customer:
Branch:
Account:
Deposit:
customer_id | account_number
Loan:
customer_id | loan_number
Payment:
Employee:
Serves:
3.2.2 The University Organization Problem
Student:
matNr | sName
Professor:
pname | psalary
Class:
Takes:
TA:
3.3 Query Languages
A query language is a language in which a user requests information from the database.
These languages can be categorized as either procedural or non-procedural. In a
procedural language, the user instructs the system to perform a sequence of operations on
the database to compute the desired result. In a non-procedural language, the user
describes the desired information without giving a specific procedure for that
information.
Most commercial relational database systems offer a query language that includes
elements of both the procedural and nonprocedural approaches. The most widely used
query language SQL (Structured Query Language) is such a query language.
There are a number of pure query languages. The relational algebra is procedural, whereas the tuple relational calculus and domain relational calculus are nonprocedural. These query languages are terse and formal, lacking the "syntactic sugar" of commercial
languages, but they illustrate the fundamental techniques for extracting data from the
database.
3.4.1 The Select Operation
The select operation selects tuples that satisfy a given condition. It is denoted by the Greek letter sigma (σ). The condition appears as a subscript to σ, and the relation is enclosed in parentheses after the σ.
Considering the Banking Enterprise problem, if we want to select those tuples of the loan relation where the branch is Kakrail, we write
σ branch_name = "Kakrail" (Loan)
Another problem to find all tuples of loan relation in which the amount lent is more than
Tk 1000 can be solved as :
σ amount > 1000 (Loan)
In general, we allow comparisons using =, !=, <, <=, >, >= in the selection condition.
To find those tuples of loan relation in which the amount lent is more than Tk1000 made
by Kakrail branch, we write
σ branch_name = "Kakrail" ∧ amount > 1000 (Loan)
3.4.2 The Project Operation
The project operation helps to filter out some of the attributes of a relation. In other words, we can select part of the attributes of a relation using the project operation. Projection is denoted by the uppercase Greek letter pi (Π). We list those attributes of a relation we wish to appear in the result as a subscript to Π. The relation appears in parentheses after the projection.
If we wish to list all loan numbers and amounts of the loans in the loan relation, we write:
Π loan_number, amount (Loan)
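The two operations can be sketched in a few lines of Python, with a relation represented as a list of dicts; the Loan rows below are invented for illustration:

```python
# Minimal sketch of select (sigma) and project (pi); sample Loan tuples only.
loan = [
    {"loan_number": "L-15", "branch_name": "Kakrail", "amount": 1500},
    {"loan_number": "L-16", "branch_name": "Motijheel", "amount": 900},
    {"loan_number": "L-17", "branch_name": "Kakrail", "amount": 700},
]

def select(rows, pred):        # sigma_pred(rows): keep tuples satisfying pred
    return [r for r in rows if pred(r)]

def project(rows, attrs):      # pi_attrs(rows): keep attrs, drop duplicates
    return {tuple(r[a] for a in attrs) for r in rows}

kakrail = select(loan, lambda r: r["branch_name"] == "Kakrail")
print(project(kakrail, ["loan_number", "amount"]))
```

Note that projection returns a set, mirroring the duplicate removal of the relational model.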
3.4.3 Composition of Relational Operations
Consider the more complicated query "Find those customers who live in Dhaka city". We write:
Π customer_name (σ customer_city = "Dhaka" (Customer))
This query is a composition of the relational operations of both select and project.
3.4.4 The Union Operation
Consider a query to find the ids of all bank customers who have either an account or a loan or both. Note that the customer relation alone does not contain this information; to answer the query, we need to extract information from both the deposit and borrow relations.
We know how to find the ids of all customers with a loan in the bank:
Π customer_id (Borrow)
We also know how to find the ids of all customers with an account in the bank:
Π customer_id (Deposit)
To answer the query, we need the union of these two sets, that is, all customer ids that appear in either or both of the two relations. So we write:
Π customer_id (Borrow) ∪ Π customer_id (Deposit)
3.4.5 The Set-Difference Operation
The set-difference operation, denoted by −, allows us to find tuples that are in one relation but not in another. The expression r − s produces a relation containing those tuples in r but not in s.
We find all customer ids of the bank who have an account but not a loan by writing:
Π customer_id (Deposit) − Π customer_id (Borrow)
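Python's set operators mirror the three relational set operations directly; the customer ids below are invented sample data:

```python
# Union, difference and intersection of the projected id sets correspond to
# the relational set operations on Borrow and Deposit.
borrow_ids  = {"01", "02", "03"}   # customers with a loan
deposit_ids = {"02", "03", "04"}   # customers with an account

print(sorted(borrow_ids | deposit_ids))   # a loan, an account, or both
print(sorted(deposit_ids - borrow_ids))   # an account but no loan
print(sorted(borrow_ids & deposit_ids))   # both
```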
3.4.6 The Set-Intersection Operation
The set-intersection operation is denoted by ∩. Suppose that we wish to find the ids of all customers who have both a loan and an account. Using set intersection, we can write
Π customer_id (Borrow) ∩ Π customer_id (Deposit)
3.4.7 The Cartesian-Product Operation
The Cartesian-product operation, denoted by ×, allows us to combine information from any two relations. In the schema of the result, we drop relation-name prefixes from those attributes that appear in only one of the two schemas; with this simplified relation schema, we can still distinguish borrow.loan_number from loan.loan_number.
Now consider sample values in the two relations borrow and loan.
Borrow:
customer_id loan_number
01 L-15
02 L-16
03 L-17
Loan:
Now r=borrow x loan would be:
Thus we see that in the Cartesian product of borrow and loan relations, every tuple of
borrow relation is combined with every tuple of loan relation.
Suppose we want to find the customer ids with loans at Kakrail Branch.
First, consider selecting from Borrow × Loan the tuples where the branch is Kakrail. We write:
σ branch_name = "Kakrail" (Borrow × Loan)
Since the Cartesian-product operation combines every tuple of loan with every tuple of
borrow, we know that, if a customer has a loan in the Kakrail branch, then there is some
tuple in borrow x loan that contains his id, and borrow.loan_number = loan.loan_number.
So if we write,
σ branch_name = "Kakrail" ∧ borrow.loan_number = loan.loan_number (Borrow × Loan)
we get only those tuples of borrow x loan that pertain to customers who have a loan at the
Kakrail branch.
Π customer_id (σ branch_name = "Kakrail" ∧ borrow.loan_number = loan.loan_number (Borrow × Loan))    [1]
The result of this expression, shown in Table 3.1, is the correct answer to the query.
Table 3.1: The result of expression [1]
customer_id
01
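The product-then-select-then-project pipeline can be sketched in Python; the sample tuples are invented, and only the branch name follows the text:

```python
# The Kakrail-loan query as Cartesian product, selection, then projection.
borrow = [("01", "L-15"), ("02", "L-16"), ("03", "L-17")]
loan   = [("L-15", "Kakrail", 1500), ("L-16", "Motijheel", 900)]

# Cartesian product: every borrow tuple paired with every loan tuple.
product = [(c, bl, ln, br, amt) for (c, bl) in borrow
                                for (ln, br, amt) in loan]
# Selection: Kakrail branch AND borrow.loan_number = loan.loan_number.
selected = [row for row in product if row[3] == "Kakrail" and row[1] == row[2]]
# Projection onto customer_id.
result = {row[0] for row in selected}
print(result)  # {'01'}
```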
3.4.8 The Rename Operation
Unlike relations in the database, the results of relational-algebra expressions do not have a name that we can use to refer to them. It is useful to be able to give them names with the rename operator, denoted by the lowercase Greek letter rho (ρ). Given a relational-algebra expression E, the expression
ρ x (E)
returns the result of expression E under the name x. We can also apply the rename operation to a relation r to get the same relation under a new name. The expression
ρ x(A1, A2, …, An) (E)
returns the result of expression E under the name x, and with the attributes renamed to A1, A2, …, An.
To illustrate renaming a relation, we consider the query Find the largest account balance
in the bank. Our strategy is to (1) compute first a temporary relation consisting of those
balances that are not the largest and (2) take the set difference between the relations:
We shall use the rename operation to rename one reference to the account relation; thus, we can reference the relation twice without ambiguity.
The temporary relation that consists of the balances that are not the largest is:
Π account.balance (σ account.balance < d.balance (account × ρ d (account)))
The query to find the largest account balance in the bank can then be written as:
Π balance (account) − Π account.balance (σ account.balance < d.balance (account × ρ d (account)))
It is often desirable to simplify certain queries that require a Cartesian product. Usually, a query that involves a Cartesian product includes a selection operation on the result of the Cartesian product. Consider the query "Find the names of all customers who have a loan at the bank, along with the loan number and the loan amount."
The natural join is a binary operation that allows us to combine certain selections and a Cartesian product into one operation. It is denoted by the join symbol ⋈. The natural-join operation forms a Cartesian product of its two arguments, performs a selection forcing equality on those attributes that appear in both relation schemas, and finally removes duplicate attributes. The query above can thus be written as:
Π customer_name, loan_number, amount (Borrow ⋈ Loan)
Find the names of all branches with customers who have an account in the bank
and who live in Dhaka.
The natural join operation on the three relations can be executed in any order.
Find all customers who have both a loan and an account at the bank.
Π customer_name (Borrow ⋈ Deposit)
Note that in section 3.4.6 we wrote an expression for this query by using set intersection. We repeat that expression here.
The outer join is an extension of the join operation to deal with missing information.
Suppose that we have the relations with the following schemas, which contain data on
full-time employees.
Consider the employee and fullt_works relations in Tables 3.2 and 3.3 respectively.
Suppose that we want to generate a single relation with all the information (street, city,
branch_name and salary) about full- time employees. A possible approach would be to
use the natural join operation as follows:
employee ⋈ fullt_works
Table 3.4: employee ⋈ fullt_works
Note that we have lost the street and city information about Shumon, since the tuple
describing Shumon is absent from the fullt_works relation; similarly, we have lost the
branch_name and salary information about Swarna, since the tuple describing Swarna is
absent from the employee relation.
We can use the outer join operation to avoid this loss of information. There are actually three forms of the operation: the left outer join (⟕), the right outer join (⟖) and the full outer join (⟗). All three forms compute the join and then add extra tuples to its result. The results of employee ⟕ fullt_works, employee ⟖ fullt_works and employee ⟗ fullt_works appear in Tables 3.5, 3.6 and 3.7 respectively.
The left outer join takes all tuples in the left relation that did not match any tuple in the right relation, pads the tuples with null values for all other attributes from the right relation, and adds them to the result of the natural join. In Table 3.5, the tuple (Shumon, New Market Rd, Chittagong, Null, Null) is such a tuple. All information from the left relation is present in the result of the left outer join.
The right outer join is symmetric with the left outer join. It pads tuples from the right relation that did not match any from the left relation with nulls and adds them to the result of the natural join. In Table 3.6, the tuple (Swarna, Null, Null, Rampura, 20000) is such a tuple.
The full outer join does both of these operations, padding tuples from the left relation that did not match any from the right relation, as well as tuples from the right relation that did not match any from the left relation, and adding them to the result of the join. Table 3.7 shows the result of a full outer join.
Table 3.5: employee ⟕ fullt_works
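A minimal Python sketch of the padding behaviour, using the names from the employee / fullt_works example (addresses and salaries are invented):

```python
# Each relation is keyed by employee name; values are the remaining attributes.
employee = {"Shumon": ("New Market Rd", "Chittagong"),
            "Rahim":  ("Mirpur Rd", "Dhaka")}
fullt_works = {"Rahim":  ("Banani", 15000),
               "Swarna": ("Rampura", 20000)}

def left_outer(l, r):
    # Keep every left tuple; pad missing right attributes with None (NULL).
    return {k: l[k] + r.get(k, (None, None)) for k in l}

def full_outer(l, r):
    out = left_outer(l, r)
    for k in r:
        if k not in l:                   # pad missing left attributes too
            out[k] = (None, None) + r[k]
    return out

print(left_outer(employee, fullt_works)["Shumon"])
# ('New Market Rd', 'Chittagong', None, None)
```

The right outer join is just `left_outer` with the arguments (and attribute order) swapped.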
The division operation in relational algebra, denoted by ÷, is suited to queries that include the phrase "for all".
Suppose that we wish to find all customers who have an account at all branches located in
Dhaka.
We can find all (customer_name, branch_name) for which the customer has an account at
a branch by writing
We need to find the customers who appear in r2 paired with every branch name in r1. The operation that provides exactly those customers is the divide operation; that is, the customers who have an account at all branches in Dhaka city are given by
r2 ÷ r1
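Division is easy to sketch over sets of tuples; the branch and customer names below are invented sample data:

```python
# r1: branch names (the divisor); r2: (customer, branch) pairs (the dividend).
r1 = {("Kakrail",), ("Motijheel",)}
r2 = {("Kamal", "Kakrail"), ("Kamal", "Motijheel"),
      ("Tinku", "Kakrail")}

def divide(r2, r1):
    # A customer qualifies only if paired with EVERY branch in r1.
    customers = {c for (c, _) in r2}
    return {c for c in customers
            if all((c, b) in r2 for (b,) in r1)}

print(divide(r2, r1))  # {'Kamal'}
```

Tinku is dropped because the pair (Tinku, Motijheel) is missing from r2.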
3.7.2 Insertion
Suppose we wish to insert the fact that Smith has Tk 1200 in account A-973 at the Kakrail branch. We write:
account ← account ∪ {(A-973, Kakrail, 1200)}
3.7.3 Updating
Suppose that interest payments are being made, and that all account balances are to be increased by 5 percent. For this, we require an update operation, which can be written with the generalized-projection operator:
account ← Π account_number, branch_name, balance * 1.05 (account)
Suppose that accounts with balances over Tk 10,000 receive 6 percent interest, whereas all others receive 5 percent. We can write:
account ← Π account_number, branch_name, balance * 1.06 (σ balance > 10000 (account)) ∪ Π account_number, branch_name, balance * 1.05 (σ balance <= 10000 (account))
C H A P T E R 4
SQL
SQL stands for Structured Query Language. SQL is today the most widely used query
language for relational databases.
Each column/attribute has a specific data type. Basic data types in relational systems are
as follows:
We start with some SQL examples related to the University Organization Problem.
Query Expression
After FROM, all tables that are necessary for finding the query result must be listed.
After WHERE, all constraints (formulas) for joins and selections (conditions) in the query must be listed.
Query Expression:
Query Expression:
Q3 Display the names and salaries of professors whose salary is greater than 1 lac.
Query Expression:
Q4 Find the matNrs of all students younger than the oldest student named Philips.
Query Expression:
Q5 Find the names of all students who took CSE303, but who were not yet assigned a
grade.
Query Expression
SELECT sName
FROM Student, Takes
WHERE Student.matNr = Takes.matNr
AND Takes.classNr = 'CSE303'
AND Takes.grade IS NULL
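The IS NULL test matters here: in SQL, a comparison with NULL is never true, so a condition written grade = NULL matches nothing. A small SQLite check (table contents invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Takes (matNr INTEGER, classNr TEXT, grade REAL)")
conn.executemany("INSERT INTO Takes VALUES (?, ?, ?)",
                 [(1, "CSE303", 3.5), (2, "CSE303", None)])

# grade = NULL evaluates to UNKNOWN for every row, so it selects nothing.
print(conn.execute("SELECT COUNT(*) FROM Takes WHERE grade = NULL").fetchone()[0])   # 0
# grade IS NULL correctly finds the missing grade.
print(conn.execute("SELECT COUNT(*) FROM Takes WHERE grade IS NULL").fetchone()[0])  # 1
```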
Q6 Find all students who took the course CSE 303 or worked as a TA for this class.
Query Expression
Q7 Find all names of students who took any class not given by Professor James.
Query Expression:
SELECT sName
FROM Student
WHERE matNr IN
(SELECT matNr
FROM Takes
WHERE classNr NOT IN
(SELECT classNr
FROM Class
WHERE pname = 'James'))
Functions:
AVG() ( = average)
MAX()
MIN()
COUNT() (= number of tuples)
SUM()
DISTINCT() (= list every result value just once; remove
duplicates)
Q1 Find the number of tuples in table Professor.
Q2 List the sum of all professors' salaries, the highest, the smallest, and the average salary in the table.
Q3 What is the number of different classes for which teaching assistants work?
Q1 List for every class the classNr, the number of students who took the class, and the
average exam grade.
The GROUP BY clause groups on the classNrs, meaning there will be two groups of classNr, CSE 303 and CSE 304, and for each group the classNr, the number of students who took the class and the average grade will be listed. So the result will be:
GROUP BY must always be used whenever the Select part contains attributes with
aggregate functions and attributes to which no aggregate functions are applied. In these
cases all attributes without aggregate function must be listed in the GROUP BY clause.
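A runnable sketch of such a grouped query in SQLite (the grades are invented sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Takes (matNr INTEGER, classNr TEXT, grade REAL)")
conn.executemany("INSERT INTO Takes VALUES (?, ?, ?)",
                 [(1, "CSE303", 3.0), (2, "CSE303", 4.0), (3, "CSE304", 2.0)])

# classNr carries no aggregate function, so it must appear in GROUP BY.
rows = conn.execute("""SELECT classNr, COUNT(*), AVG(grade)
                       FROM Takes
                       GROUP BY classNr""").fetchall()
print(sorted(rows))
```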
Q2 List for every class the classNr, the number of students who took the class, and the
average exam grade where the grade average is better than 3.
Using ORDER BY in the above expression, the grades will be listed in descending order.
The default is ASC (ascending order).
The operators LIKE and NOT LIKE use wild card characters.
% stands for an arbitrary number (0 or more) of arbitrary
characters.
_ stands for exactly one arbitrary character
Q2 Find all students whose names do not end with the letter m.
4.6 SQL Expressions concerning Input of Data
INSERT INTO Class (pname, classNr) VALUES ('Jane', 'CSE 303'), ('David', 'CSE 304')
Values that are not specified are automatically either set to NULL or to a default value.
Examples:
Examples:
UPDATE Class
SET room = '7B03', day = 'Friday'
WHERE classNr = 'CSE 304'
UPDATE TA
SET tasalary = 1.1 * tasalary
If CASCADE is specified:
All views and references from other tables that refer to this table are also deleted.
If RESTRICT is specified:
Deletion is only executed if the table is not referenced by other tables or views.
Example:
Dropping a column:
Example:
A domain is essentially a data type with optional constraints (restrictions on the allowed
set of values).
Examples:
CREATE DOMAIN MatNr AS CHAR(6);
The second example allows us to define restrictions on the values in the desired domain GenderType.
Example:
Normally, views are not stored like base tables but are generated just in time when a
query accesses the view. Some systems allow storing views for performance reasons.
Referring to the Bank Enterprise, here are some sample SQL queries and corresponding
SQL expressions:
Q For all customers who have a loan from the bank, find their names, loan numbers and
loan amount.
SELECT customer_name, T.loan_number, S.amount
FROM borrow AS T, loan AS S
WHERE T.loan_number = S.loan_number
Q Find all bank customers having a loan, an account or both at the bank.
We can write the expression for the above query using a normal UNION operation as:
If we want to retain all duplicates, we must write UNION ALL in place of UNION as:
Table 4.1: Loan relation
customer_name loan_number
Kamal L-170
Tinku L-230
Jones L-155
The expression computes the normal join of loan and borrow relations with the join
condition being loan.loan_number = borrow.loan_number
We may rename the result relation of a join and the attributes of the result relation by
using an as clause as shown below:
The following table shows the result of the above left outer join.
Table 4.4: The result of loan left outer join borrow on loan.loan_number =
borrow.loan_number
In the resulting relation, the tuples (L-170, Kakrail, 3000) and (L-230, Motijheel, 4000) from loan join with tuples from borrow, and so appear in the result of the inner join and hence in the result of the left outer join. On the other hand, the tuple (L-260, Dhanmondi, 1700, NULL, NULL) has no matching borrow tuple, and is present only in the result of the left outer join.
This expression computes the natural join of the two relations. The only attribute name
common to loan and borrow is loan_number and it appears once in the result unlike the
result of the join with the on condition.
The right outer join is symmetric to the left outer join. Tuples from the right-hand-side
relation that do not match any tuples in the left-hand-side relation are padded with nulls
and are added to the result of the right outer join.
Here is an example of combining the natural-join condition with the right outer join type:
Table 4.5: The result of loan natural right outer join borrow
Now we consider a full outer join operation. For example, the following table shows the
result of full outer join expression.
Table 4.6: The result of loan full outer join borrow using (loan_number)
As another example, we can write the query, Find all customers who have either an
account or loan but not both at the bank with natural full outer join as:
C H A P T E R 5
INTEGRITY CONSTRAINTS
In relational systems, usually the following types of integrity constraints are considered:
Some attributes must contain a valid value. For example, we could assume that every
student must have a name in the database. There cannot be student tuples where the name
is not given.
This constraint can be defined in the CREATE TABLE statement by adding NOT NULL
after the data type of the attribute.
Example:
The database then will not accept the input of tuples where a
name is not given.
Example:
CREATE TABLE Student
( sName Varchar(20) NOT NULL,
matNr Integer NOT NULL,
gender genderType);
The primary key of a table must contain a unique, non-null value for each row. This can be ensured by adding the phrase PRIMARY KEY(A1, …, An) to a CREATE TABLE statement. It can be used only once for each table.
Example:
CREATE TABLE TA
(matNr Integer NOT NULL,
classNr CHAR(10) NOT NULL,
hours Integer,
tasalary Double,
PRIMARY KEY (matNr, classNr));
It is also possible to enforce uniqueness and presence of a defined value (other than
NULL) for other attributes, for instance alternate keys, by using the expressions
UNIQUE and NOT NULL.
Example:
CREATE TABLE TA
(matNr Integer NOT NULL,
classNr Char(10) NOT NULL,
hours Integer,
tasalary Double,
PRIMARY KEY (matNr, classNr),
UNIQUE (classNr));
This would mean that the database accepts only one TA for each class.
5.4 Referential Integrity
Relationships are modeled in the relational model by using foreign keys. Referential integrity means that, if the foreign key contains a value, the value must refer to an existing, valid row in the table it refers to.
The definition of foreign keys is supported with the FOREIGN KEY clause in the
CREATE TABLE statement.
Example:
CREATE TABLE TA
(matNr Integer NOT NULL,
classNr Char(10) NOT NULL,
hours Integer,
tasalary Double,
PRIMARY KEY (matNr, classNr),
FOREIGN KEY (matNr) REFERENCES Student (matNr),
FOREIGN KEY (classNr) REFERENCES Class (classNr));
The database then rejects any INSERT or UPDATE operation that attempts to create a
foreign key value without a matching value in the table to which it refers.
The action that the DBMS takes for any UPDATE or DELETE operation that concerns a
value that matches foreign key values in another table is dependent on the referential
action specified using the ON UPDATE and ON DELETE subclauses of the FOREIGN
KEY clause, discussed as follows.
DELETION
When a tuple with a primary key value is deleted, the following actions can be performed
on the tuples with the corresponding foreign key values:
a. CASCADED DELETE
All tuples whose foreign key values equal the deleted primary key values are also
deleted.
(Example: when a tuple in student is deleted, matching foreign key tuples in Takes
should also be deleted)
Example:
CREATE TABLE Takes
(matNr Integer NOT NULL,
classNr Varchar(10) NOT NULL,
FOREIGN KEY (matNr) REFERENCES Student (matNr)
ON DELETE CASCADE);
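The cascade can be demonstrated in SQLite, which requires foreign-key enforcement to be switched on explicitly; the sample data is invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only on request
conn.execute("CREATE TABLE Student (matNr INTEGER PRIMARY KEY, sName TEXT)")
conn.execute("""CREATE TABLE Takes (
    matNr INTEGER NOT NULL, classNr TEXT NOT NULL,
    FOREIGN KEY (matNr) REFERENCES Student (matNr) ON DELETE CASCADE)""")
conn.execute("INSERT INTO Student VALUES (1, 'Rahim')")
conn.execute("INSERT INTO Takes VALUES (1, 'CSE303')")

# Deleting the student automatically deletes the matching Takes tuples.
conn.execute("DELETE FROM Student WHERE matNr = 1")
print(conn.execute("SELECT COUNT(*) FROM Takes").fetchone()[0])  # 0
```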
b. RESTRICTED DELETE
As long as tuples exist with foreign key values that match the primary key value which is
to be deleted, the deletion is not accepted.
(Example: a department record may not be deleted as long as there are tuples with employees belonging to this department.) This is effectively the default behaviour when no ON DELETE action is specified.
c. No Action
With the clause ON DELETE NO ACTION it is specified that the deletion of a tuple with the primary key does not invoke any action on the tuple(s) with the matching foreign key value.
Example:
d. Nullification
All foreign key values that match the deleted primary key value are set to NULL.
(Example: when a professor is deleted, the pname value in the matching Class tuples is set to NULL)
Example:
CREATE TABLE Class
(classNr Char(10) NOT NULL,
room Integer,
day Date,
pname VarChar(30),
PRIMARY KEY (classNr),
FOREIGN KEY (pname) REFERENCES Professor (pname)
ON DELETE SET NULL);
e. Default
The foreign key values are set not to NULL but to a default value.
For example,
Staff
DegreePrograms
In the above tables, studentCounsellor in the DegreePrograms table is a foreign key to the primary key persId in the Staff table. The default studentCounsellor is the Dean of Engineering. If studentCounsellor Henry with persId 003 is deleted from Staff, then the default studentCounsellor becomes the Dean with persId 123. The result of this action is shown in the following tables:
Staff
persId | persname | Designation
001 | John | Head
002 | Mary | Head
123 | David | Dean

DegreePrograms
programId | Program | studentCounsellor
01 | CSE | 001
02 | EEE | 002
03 | CE | 123
5.5 Modification
When a primary key value is updated, the following actions can be performed on the
tuples with the corresponding foreign key values. All cases are analogous to the case of
deletion.
Note:
It is possible to combine for the same attribute different types of actions for delete and
update, for example adding to a foreign key definition the clause ON UPDATE SET
DEFAULT ON DELETE CASCADE.
C H A P T E R 6
NORMALIZATION
6.1 Functional Dependency
Y is said to be functionally dependent on X if, for any pair of tuples, two different values of Y do NOT correspond to the same value of X.
Notation: X -> Y
Example:
Student(matNr, sName)
Two different sNames do NOT correspond to the same matNr. So sName is functionally
dependent on matNr.
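The definition can be checked mechanically: X -> Y holds in a relation instance when no two tuples agree on X but differ on Y. A small Python sketch with invented Student rows:

```python
def fd_holds(rows, x, y):
    # X -> Y holds if every X-value maps to exactly one Y-value.
    seen = {}
    for r in rows:
        key = tuple(r[a] for a in x)
        val = tuple(r[a] for a in y)
        if seen.setdefault(key, val) != val:
            return False
    return True

student = [{"matNr": 1, "sName": "Rahim"},
           {"matNr": 2, "sName": "Rahim"},   # same name, different matNr: allowed
           {"matNr": 3, "sName": "Karim"}]

print(fd_holds(student, ["matNr"], ["sName"]))  # True: matNr -> sName
print(fd_holds(student, ["sName"], ["matNr"]))  # False: names are not unique
```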
6.2 Full Functional Dependency
Y is fully functionally dependent on X if Y is functionally dependent on X but not on any proper subset of X.
Notation: X => Y
Example:
Our familiar university example:
The set {classNr, room} on the left can be further reduced to the set {classNr}: two different professors do NOT correspond to the same classNr, so room can be ignored. Hence pname is fully functionally dependent on classNr.
Similarly:
6.3 Normalization
Codd introduced a number of normal forms. They are principles that may or may not hold for a relation. Relations can be transformed in order to normalize them. We will talk about first normal form (1NF), second normal form (2NF), third normal form (3NF) and Boyce-Codd normal form (BCNF).
6.3.1 First Normal Form
Definition:
A relation is in first normal form if it contains only simple, atomic values for attributes, no sets. In other words, attributes should not have subattributes.
Example:
Relation Person (offspring is a set-valued attribute with subattributes child and age):
pName | offspring (child, age) | Place
James | (Christa, 12), (Peter, 10), (Iris, 9) | Sweden
Schmidt | (Martin, 17), (Rainer, 18) | Germany
First attempt:
Person:
pName | Place
James | Sweden
Schmidt | Germany
Child:
Advantage:
This requires just the amount of space that is needed, and it is in 1NF.
Disadvantage:
6.3.2 Second Normal Form
Definition:
A relation is in 2NF if it is in 1NF and every non-primary key attribute is fully
functionally dependent on the primary key of the relation.
Example:
Our University Database:
TA(matNr, classNr, sName, hours, tasalary)
{matNr, classNr}=>{hours}
{matNr, classNr} => {tasalary}
{matNr} => {sName}
TA is not in 2NF because sName is not fully functionally dependent on the primary key
(matNr, classNr) but is fully functionally dependent on part of the primary key (matNr).
Solution:
We have to split the original relation to make it 2NF, i.e., move the dependency {matNr} => {sName} to a separate relation: the relation Student.
6.3.3 Third Normal Form
A relation is in 3NF if it is in 2NF and no non-key attribute is transitively dependent on the primary key. Z is transitively dependent on X if
a) a chain exists: X => Y => Z
b) Y is not a super key
c) Z is not part of a primary key
In other words, there should not be dependencies between non-key attributes.
Example:
Functional dependencies:
{matNr, classNr} => {hours}
{matNr, classNr} => {tasalary}
Assumption:
{hours} => {tasalary}
Solution:
6.3.4 Boyce-Codd Normal Form (BCNF)
Definition:
A relation is in BCNF if no part of a primary key is functionally dependent on any non-primary-key attribute.
Example:
Here the relation is not in BCNF because of the dependency: {postcode} => {town}
Speedlimit:
Codes
postcode | town
1217 | Dhaka
1000 | Dhaka
Now if we join the split tables, each streetSegment has all the postcodes in a town. So the
decomposition is lossy as shown:
Speedlimit
Postcodes
streetSegment | postcode
A-str | 1217
B-str | 1217
C-str | 1217
D-str | 1000
Third attempt:
Here both the relations are in BCNF. The decomposition is lossless and there is less repetition of values.
Speedlimit
postcode | streetSegment | speed
1217 | A-str | 30
1217 | B-str | 30
1217 | C-str | 50
1000 | D-str | 70

Codes
postcode | town
1217 | Dhaka
1000 | Dhaka
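That this decomposition is lossless can be checked by joining the two tables back on postcode and comparing with the original tuples:

```python
# The BCNF tables from the third attempt, as sets of tuples.
speedlimit = {(1217, "A-str", 30), (1217, "B-str", 30),
              (1217, "C-str", 50), (1000, "D-str", 70)}
codes = {(1217, "Dhaka"), (1000, "Dhaka")}

# Natural join on postcode.
joined = {(pc, st, sp, town) for (pc, st, sp) in speedlimit
                             for (pc2, town) in codes if pc == pc2}

# Projecting the join back onto Speedlimit's attributes yields exactly the
# original relation: no spurious tuples, so the decomposition is lossless.
print({(pc, st, sp) for (pc, st, sp, town) in joined} == speedlimit)  # True
```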
It is possible to show:
A relation that is not in BCNF can always be losslessly
decomposed towards BCNF.
A lossless decomposition into BCNF that preserves all
dependencies does not always exist.
Advantages of normalization:
Many unnecessary redundancies are avoided.
Anomalies with input, deletion and updates can be avoided
Fully normalized relations tend to need less space than if not
normalized
Disadvantages of normalization:
6.5 Quality Criteria for Relational Design
C H A P T E R 7
QUERY PROCESSING
Query Processing refers to the range of activities involved in extracting data from a
database.
The first step in query processing is that the system must translate a given query (in SQL format) into its internal form, that is, convert it to a relational-algebra expression. In generating the internal form of the query, the parser checks the syntax of the user's query, verifies that the relation names appearing in the query are names of relations in the database, and so on. A parse-tree representation of the query is constructed, which is then translated into a relational-algebra expression.
A relational-query language is either declarative or algebraic. Declarative languages permit users to specify what a query should generate without saying how the system should do the generating (e.g., SQL). Algebraic languages allow for algebraic transformations of users' queries (e.g., relational algebra).
The algebraic basis provided by the relational model helps in query optimization. Query optimization is the process of selecting the most efficient query-evaluation plan for a query, that is, a way of executing the query with minimum cost. An SQL query can be translated into a relational-algebra expression in one of several ways. Consider, for example, the query
SELECT balance
FROM account
WHERE balance < 2500
This SQL expression can be translated into a relational-algebra expression in one of two ways:
σ balance < 2500 (Π balance (account))
Π balance (σ balance < 2500 (account))
To evaluate the second expression, we can scan every tuple in account to find tuples with balance less than 2500. If an index (a sorted access structure over the values of an attribute in a database table) is available on the attribute balance, we can use the index instead.
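That the two algebraic orders are equivalent is easy to check on a toy instance (account rows invented):

```python
account = [{"account_number": "A-1", "balance": 2000},
           {"account_number": "A-2", "balance": 3000}]

# Order 1: select first, then project the balance attribute.
plan1 = {r["balance"] for r in account if r["balance"] < 2500}
# Order 2: project the balance attribute first, then select on it.
plan2 = {b for b in {r["balance"] for r in account} if b < 2500}

print(plan1 == plan2)  # True: both orders yield the same result
```

The optimizer's job is to pick whichever equivalent order (and access method, such as an index scan) is cheapest.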
Query optimization involves the selection of instructions for processing and evaluating
each operation of a query such as choosing an algorithm to use for executing an
operation, choosing the specific indices to use and so on. A relational algebra operation
annotated with instructions on how to evaluate it is called an evaluation primitive.
Several primitives may be grouped together into a pipeline in which several operations
are performed in parallel. A sequence of evaluation primitives that can be used to
evaluate a query is called a query-evaluation plan or query-execution plan.
The above figure illustrates an evaluation plan for our example query in which a
particular index 1 (on the attribute balance) is specified for the selection operation. The
different evaluation plans for a given query can have different costs. It is the
responsibility of the system to construct a query-evaluation plan that minimizes the cost
of query evaluation. The most relevant performance measure is the number of disk
accesses. Optimizers make use of statistical information about the relations, such as
relation sizes and index depths, to make a good estimate of the cost of a plan. In the above
figure, the evaluation plan in which the selection is done using the index is likely to have
the lowest cost and thus to be chosen. The query-evaluation engine takes a
query-evaluation plan, executes it, and returns the answers to the query.
[Figure: the optimizer uses statistics about the data to turn a query into an execution
plan, which the evaluation engine runs against the data to produce the output.]
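To watch an optimizer make this choice in practice, here is a small SQLite experiment; the schema and data are assumptions of mine for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (account_number TEXT, branch_name TEXT, balance REAL)")
con.executemany("INSERT INTO account VALUES (?, ?, ?)",
                [("A-101", "Kakrail", 500), ("A-201", "Rampura", 900),
                 ("A-305", "Shantinagar", 3500)])
con.execute("CREATE INDEX idx_balance ON account(balance)")

# Ask the optimizer how it would evaluate the query.
plan = con.execute(
    "EXPLAIN QUERY PLAN SELECT balance FROM account WHERE balance < 2500").fetchall()
print(plan)  # inspect whether SQLite chose the index or a full scan

rows = con.execute(
    "SELECT balance FROM account WHERE balance < 2500 ORDER BY balance").fetchall()
assert rows == [(500.0,), (900.0,)]
```

With the index present, SQLite's plan output normally reports a search on idx_balance rather than a scan of the whole table.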
C H A P T E R 8
FILE ORGANIZATION
Since operating systems are efficient at managing file systems, a database is usually
mapped onto files maintained by the operating system. A file is organized logically as a
sequence of records, and these records are mapped onto disk blocks.
Although blocks are of a fixed size determined by the physical properties of the disk and
by the operating system, record sizes vary. In a relational system, tuples of distinct
relations are generally of different sizes.
One approach to mapping the database to files is to use several files and to store records
of only one fixed length in any given file. An alternative is to structure our files such that
we can accommodate multiple lengths for records.
Let us consider a file of account records for our bank database. Each record of this file
holds an account number, a branch name and a balance.
If we assume that each character occupies 1 byte and that a real occupies 8 bytes, our
account record is 40 bytes long. A simple approach is to use the first 40 bytes for the first
record, the next 40 bytes for the second record and so on. There are two problems with
this approach:
When a record is deleted, we could move the record that came after it into the space
formerly occupied by the deleted record and so on, until every record following the
deleted record has been moved ahead. Such an approach requires moving a large number
of records.
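The fixed-length layout can be sketched with Python's struct module. The individual field widths (a 10-byte account number, a 22-byte branch name, an 8-byte real) are my assumption, chosen only so that a record comes to 40 bytes as in the text:

```python
import struct

# Assumed layout: 10 + 22 + 8 = 40 bytes per record.
RECORD = struct.Struct("<10s22sd")
assert RECORD.size == 40

def pack(acct, branch, balance):
    # pad the character fields with spaces to their fixed widths
    return RECORD.pack(acct.encode().ljust(10), branch.encode().ljust(22), balance)

def unpack(buf):
    a, b, bal = RECORD.unpack(buf)
    return a.decode().strip(), b.decode().strip(), bal

f = bytearray()
for rec in [("A-102", "Kakrail", 400.0), ("A-215", "Dhanmondi", 700.0)]:
    f += pack(*rec)

# Record i simply lives at byte offset 40 * i.
assert unpack(f[40:80]) == ("A-215", "Dhanmondi", 700.0)
```

The appeal of the scheme is exactly this offset arithmetic: record i starts at byte 40*i, so no directory is needed to locate it.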
A better approach is to allocate a certain number of bytes as a file header at the
beginning of the file. The header contains a variety of information about the file,
including the address of the first record whose contents have been deleted. The header
thus points to the first available record, the first available record points to the second
available record, and so on. The deleted records form a linked list, often referred to as a
free list. On insertion of a new record, we use the record pointed to by the header and
change the header to point to the next available record. If no free space is available, we
add the new record to the end of the file.
header
record 0 Kakrail A-102 400
record 1
record 2 Dhanmondi A-215 700
record 3 Motijheel A-101 500
record 4
record 5 Rampura A-201 900
record 6
record 7 Bashundhara A-206 600
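The free-list mechanics just illustrated can be sketched as follows. This is an in-memory toy of my own, with the header reduced to a single head index and the chain kept in a parallel list:

```python
EMPTY = object()  # marker for a deleted/unused slot

class RecordFile:
    def __init__(self, n):
        self.slots = [EMPTY] * n
        # initially every slot is free, chained in order; None ends the list
        self.free = [i + 1 if i + 1 < n else None for i in range(n)]
        self.head = 0                     # header: first free slot

    def insert(self, record):
        if self.head is None:             # no free slot: append at the end
            self.slots.append(record)
            self.free.append(None)
            return len(self.slots) - 1
        slot, self.head = self.head, self.free[self.head]
        self.slots[slot] = record
        return slot

    def delete(self, slot):
        self.slots[slot] = EMPTY
        self.free[slot], self.head = self.head, slot  # push onto the free list

f = RecordFile(3)
a = f.insert("Kakrail A-102 400")
b = f.insert("Dhanmondi A-215 700")
f.delete(a)
c = f.insert("Motijheel A-101 500")  # reuses the slot freed by the delete
assert c == a
```

Deletion is O(1): no records move; the hole is simply threaded onto the list for the next insertion.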
Here account_info is an array in which an account_number and a balance are stored in
indices 1 and 2; the pair may be repeated with other values in indices 3 and 4, and so on,
for each branch_name. There is thus no limit on how large such a record can grow, up to,
of course, the size of the disk!
8.3 Byte-String Representation
A simple method for implementing variable-length records is to attach a special
end-of-record symbol to the end of each record. We can then store each record as a string
of consecutive bytes.
Thus, the byte-string representation is not usually used in this basic form for
implementing variable-length records. However, a modified form of it, called the
slotted-page structure, is commonly used for implementing variable-length records.
The actual records are allocated contiguously in the block, starting from the end of the
block. The free space in the block is contiguous between the final entry in the header
array and the first record.
If a record is inserted, space is allocated for it at the end of free space and an entry
containing its size and location is added to the header.
If a record is deleted, the space that it occupies is freed and its entry is marked deleted (its
size is set to -1, for example). Further, the records in the block stored before the deleted
record are moved so that the free space created by the deletion is reclaimed, and all free
space again sits between the final entry in the header array and the first record. The
end-of-free-space pointer in the header is updated as well. Records can be grown or
shrunk by similar techniques, as long as there is space in the block.
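A minimal slotted-page sketch follows; the block header is kept as Python lists rather than packed bytes, and the 128-byte block size is an arbitrary assumption:

```python
class SlottedPage:
    def __init__(self, size=128):
        self.data = bytearray(size)
        self.free_end = size              # end-of-free-space pointer
        self.slots = []                   # header array of (offset, length)

    def insert(self, record: bytes):
        self.free_end -= len(record)      # records grow from the end of the block
        self.data[self.free_end:self.free_end + len(record)] = record
        self.slots.append((self.free_end, len(record)))
        return len(self.slots) - 1        # the slot number identifies the record

    def delete(self, slot):
        off, length = self.slots[slot]
        self.slots[slot] = (0, -1)        # mark the header entry deleted
        lo = self.free_end
        # move records stored below the hole up by `length` bytes, so free
        # space is again contiguous below the header
        self.data[lo + length:off + length] = self.data[lo:off]
        self.free_end += length
        for i, (o, l) in enumerate(self.slots):
            if l != -1 and o < off:
                self.slots[i] = (o + length, l)

    def read(self, slot):
        off, length = self.slots[slot]
        return bytes(self.data[off:off + length]) if length != -1 else None

p = SlottedPage()
a = p.insert(b"A-102 Motijheel 400")
b = p.insert(b"A-201 Rampura 900")
p.delete(a)
assert p.read(b) == b"A-201 Rampura 900"  # slot number still valid after compaction
```

Note that outside pointers refer to slot numbers, never to byte offsets, which is what lets the page shuffle records during compaction without invalidating anything.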
8.5 Types of Record Organizations
So far we have discussed record-length types. Now we will talk about how records are
organized within a file. A file may hold one relation or a group of relations.
In a heap file organization, there is one relation per file and the order of records within
that file does not matter.
In a sequential file organization, there is still one relation per file, but the order of records
within the file does matter: it is maintained in sorted order based on the value of a search
key. We will stay with this organization for the rest of the chapter.
In sequential file organization, records are ordered on the value of a search key, which
may be the value of an attribute or of a set of attributes. The search-key order of the
records is maintained using pointers. An example from the banking enterprise is shown
below.
When insertions and deletions are performed on the relation, reorganizations of the
records based on the search key become necessary, which may be costly. A record can be
inserted in the following way:
i) Find the record in the file that comes before the record to be
inserted in search-key order.
ii) If there is a free (deleted) record there, insert the new record in
that space. Otherwise, insert the new record in an overflow block.
iii) In either case, chain the records together by pointers in
search-key order.
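The insertion steps above can be sketched with an in-memory pointer chain; in this toy model each record's "pointer" is simply the search key of its successor:

```python
# next_rec[key] holds the search key of the following record (None at the end).
def insert(records, next_rec, head, key, value):
    records[key] = value
    # step i: find the record that precedes the new one in search-key order
    prev, cur = None, head
    while cur is not None and cur < key:
        prev, cur = cur, next_rec[cur]
    # steps ii/iii: link the new record into the chain (in a real file this
    # record may physically sit in an overflow block; the chain alone
    # preserves search-key order)
    next_rec[key] = cur
    if prev is None:
        head = key
    else:
        next_rec[prev] = key
    return head

records, next_rec, head = {}, {}, None
for k in ["Kakrail", "Dhanmondi", "Motijheel"]:
    head = insert(records, next_rec, head, k, None)

# walking the chain yields the records in search-key order
order, cur = [], head
while cur is not None:
    order.append(cur)
    cur = next_rec[cur]
assert order == ["Dhanmondi", "Kakrail", "Motijheel"]
```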
If relatively few records need to be inserted in overflow blocks, this approach works well;
otherwise the sequential approach may lose its efficiency.
As mentioned earlier, reorganizations of the file are costly, and so they should be carried
out when the system load is low. In the extreme case in which insertions rarely occur,
reorganizations are minimal, maintaining the sequential ordering of records is easy, and,
what is more, the pointer field may even be eliminated.
8.7 Multitable Clustering File Organization
In sequential file organization, one file is stored per relation, as we have seen. This is
suitable for low-cost database systems and can take full advantage of the file system that
the operating system provides.
However, many large-scale database systems do not rely directly on the underlying
operating system for file management. Instead, one large operating-system file is
allocated to the database system. The database system stores all relations in this file and
manages the file itself.
Consider a query that computes a join of the deposit and customer relations. In the worst
case, each record resides in a different block, forcing us to do one block read for each
record required by the query. The deposit and customer relations, as well as the join of
deposit and customer, are shown below:
customer_name account_number
Kamal A-105
Kamal A-220
Kamal A-300
Shammi A-301
The file structure for the join mixes together tuples of two relations but allows for
efficient processing of the join. When a tuple of the customer relation is read, the entire
block containing that tuple is copied from disk into main memory. Since the
corresponding deposit tuples are stored on the disk near the customer tuple, the block
containing the customer tuple contains tuples of the deposit relation needed for efficient
processing of the query. Thus a multitable clustering file organization stores records of
two or more relations in each block allowing us to read records that would satisfy the join
by using one block read and process the query efficiently.
Under the above scheme, processing of some queries may become slow. For example, if
we were to find all customer records, each record would be located in a distinct block. So
in order to locate all tuples of the customer relation, some additional structure such as
chaining all the records of the relation using pointers may be used as shown below.
Careful use of multitable clustering produces significant performance gain in query
processing.
C H A P T E R 9
DATA-DICTIONARY STORAGE
A relational database system needs to maintain data about the relations, such as the
schema of the relations. This information is called the data dictionary, or system catalog.
Among the types of information the system must store are these:
In addition, many systems keep the following data on users of the system:
There is also a need to store information about each index on each of the relations:
C H A P T E R 10
INDEXING
A sorted list of account numbers would not work well on very large databases with
millions of records, as the index itself would become very big. This gives rise to the need
for more sophisticated indexing techniques, which we discuss below.
No one technique for ordered indices is the best for database applications. These
techniques must be evaluated on the basis of the following factors:
10.2 Ordered Indices
We can use an index structure in order to gain fast retrieval of records in a file. Each
index structure is associated with a particular search key. The records in the indexed file
are sorted. A file may have several indices, on different search keys. If the file containing
the records is sequentially ordered, a clustering index is an index whose search key also
defines the sequential order of the file. Clustering indices are also called primary indices.
Indices whose search key defines an order different from the sequential order of the file
are called nonclustering indices or secondary indices.
The following figure shows a sequential file of account records taken from our banking
example. In this figure, the records are stored in search key order, with branch name used
as the search key.
[Figure: a sequential file of account records, sorted on branch_name and chained by
pointers:]
A-217 Dhanmondi 750
A-101 Kakrail 500
A-110 Kakrail 600
A-215 Katabon 700
A-102 Motijheel 400
A-201 Motijheel 900
A-218 Motijheel 700
A-222 Rampura 700
A-305 Shantinagar 350
An index record, or index entry, consists of a search-key value and pointers to one or
more records with that value as their search-key value. The pointer to a record consists of
the identifier of a disk block and an offset that identifies the record within the block.
There are two types of ordered indices:
o Dense index: An index entry appears for every search-key value in the file.
o Sparse index: An index entry appears for only some of the search-key values. Each
entry consists of a search-key value and a pointer to the first data record with that
search-key value.
The following figure shows a dense index for the account file. Suppose we are looking up
records for the Motijheel branch.
Using the dense index, we follow the pointer directly to the first Motijheel record and
then process records in search-key (branch_name) order until we encounter a record
for a branch other than Motijheel.
With a sparse index, as in the next figure, we do not find an index entry for Motijheel.
Since the last entry in alphabetic order before Motijheel is Katabon, we follow that
pointer and then read the account file in sequential order until we find the first
Motijheel record, and begin processing at that point.
[Figure: dense index on branch_name — one entry per branch (Dhanmondi, Kakrail,
Katabon, Motijheel, Rampura, Shantinagar), each pointing to the first record for that
branch:]
A-217 Dhanmondi 750
A-101 Kakrail 500
A-110 Kakrail 600
A-215 Katabon 700
A-102 Motijheel 400
A-201 Motijheel 900
A-218 Motijheel 700
A-222 Rampura 700
A-305 Shantinagar 350
As we have seen, it is generally faster to locate a record if we have a dense index rather
than a sparse index. However, sparse indices have advantages over dense indices in that
they require less space and they impose less maintenance overhead for insertions and
deletions.
[Figure: sparse index on branch_name — entries only for Dhanmondi, Katabon and
Rampura, each pointing to the first record for that branch in the same account file as
above.]
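The dense/sparse distinction can be sketched as follows; the file contents mirror the figures, while the particular choice of sparse entries is mine:

```python
import bisect

# Branch names of the account file, sorted on branch_name (positions 0..8).
data_file = ["Dhanmondi", "Kakrail", "Kakrail", "Katabon", "Motijheel",
             "Motijheel", "Motijheel", "Rampura", "Shantinagar"]

# Dense index: one entry per search-key value, pointing to its first record.
dense = {}
for pos, name in enumerate(data_file):
    dense.setdefault(name, pos)

# Sparse index: entries for only some search-key values.
sparse = [("Dhanmondi", 0), ("Katabon", 3), ("Rampura", 7)]

def lookup_sparse(key):
    keys = [k for k, _ in sparse]
    i = bisect.bisect_right(keys, key) - 1   # largest entry <= key
    pos = sparse[i][1]
    while data_file[pos] != key:             # then scan the file sequentially
        pos += 1
    return pos

assert dense["Motijheel"] == lookup_sparse("Motijheel") == 4
```

The dense lookup is direct; the sparse one pays a short sequential scan in exchange for a smaller index, exactly the trade-off described above.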
Even if we use a sparse index, it may become too large, and the process of searching a
large index may be inefficient and costly. To deal with this problem, we construct a
sparse index on the clustering index as shown below. To locate a record, we first use
binary search on the outer index to find the record for the largest search key value less
than or equal to the one we desire. The pointer points to a block of the inner index. We
scan this block until we find the record with the largest search-key value less than or
equal to the one we desire. The pointer in this record points to the block of the file that
contains the desired record.
Using the two levels of indexing, we have read only one index block, rather than the
seven we read with binary search, provided that the outer index is already in main
memory.
If our file is very large, we can use multilevel indices i.e, indices with two or more levels.
Searching for records with a multilevel index requires significantly fewer I/O
(input/output) operations than does searching by binary search.
[Figure: a two-level sparse index — the outer index points to blocks of the inner index,
whose entries in turn point to the data blocks of the file.]
Regardless of what form of index is used, every index must be updated whenever a
record is either inserted into or deleted from the file. We now describe processes for
updating single level indices.
o Dense indices: If the deleted record was the only record with its search-key value,
the system deletes the corresponding index entry; otherwise it updates the index
record with that search-key value to point to the next record.
o Sparse indices:
A secondary index on a candidate key looks just like a dense clustering index, except that
the records pointed to by successive values in the index are not stored sequentially. In
general, however, secondary indices may have a different structure from clustering
indices. If the search key of a clustering index is not a candidate key, it suffices if the
index points to the first record with a particular value for the search key, since the other
records can be fetched by a sequential scan of the file.
In contrast, if the search key of a secondary index is not a candidate key, it is not enough
to point to just the first record with each search-key value. The remaining records with
the same search-key value could be anywhere in the file, since the records are ordered by
the search key of the clustering index, rather than by the search key of the secondary
index. Therefore, a secondary index must contain pointers to all the records.
We can use an extra level of indirection to implement secondary indices on search keys
that are not candidate keys. The pointers in such a secondary index do not point directly
to the file. Instead, each points to a bucket that contains pointers to the file. The following
figure shows the structure of a secondary index that uses an extra level of indirection on
the account file, on the search key balance.
[Figure: secondary index on balance — index entries for the balances 350, 400, 500, 600,
700, 750 and 900, each pointing to a bucket of pointers to the matching account records:]
A-217 Dhanmondi 750
A-101 Kakrail 500
A-110 Kakrail 600
A-215 Katabon 700
A-102 Motijheel 400
A-201 Motijheel 900
A-218 Motijheel 700
A-222 Rampura 700
A-305 Shantinagar 350
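The bucket indirection can be sketched like this, with record numbers standing in for the record pointers:

```python
# The account file as in the figures; record numbers act as record pointers.
account_file = [("A-217", "Dhanmondi", 750), ("A-101", "Kakrail", 500),
                ("A-110", "Kakrail", 600), ("A-215", "Katabon", 700),
                ("A-102", "Motijheel", 400), ("A-201", "Motijheel", 900),
                ("A-218", "Motijheel", 700), ("A-222", "Rampura", 700),
                ("A-305", "Shantinagar", 350)]

# Secondary index on balance: each search-key value points to a bucket of
# record numbers, since matching records may sit anywhere in the file.
buckets = {}
for rec_no, (_, _, balance) in enumerate(account_file):
    buckets.setdefault(balance, []).append(rec_no)

# All accounts with balance 700, wherever they are stored:
assert [account_file[i][0] for i in buckets[700]] == ["A-215", "A-218", "A-222"]
```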
The B+-tree index structure is the most widely used of several index structures that
maintain their efficiency despite insertion and deletion of data. A B+-tree index takes the
form of a balanced tree in which every path from the root of the tree to a leaf of the tree
is of the same length (the B is commonly said to stand for balanced). Each nonleaf node
in the tree has between (ceiling of n/2) and n children, where n is fixed for a particular
tree.
P1 | K1 | P2 | ... | Pn-1 | Kn-1 | Pn
10.3.1 Structure of a B+-Tree
A B+-tree index is a multilevel index, but it has a structure that differs from that of the
multilevel index-sequential file. The figure above shows a typical node of a B+-tree. It
contains up to n-1 search-key values K1, K2, ..., Kn-1 and n pointers P1, P2, ..., Pn. The
search-key values within a node are kept in sorted order; thus, if i < j, then Ki < Kj.
[Figure: a leaf node containing the search-key values Dhanmondi and Motijheel.]
We consider first the structure of the leaf nodes. For i = 1, 2, ..., n-1, pointer Pi points
either to a file record with search-key value Ki or to a bucket of pointers, each of which
points to a file record with search-key value Ki. The bucket structure is used only if the
search key does not form a candidate key and the file is not sorted in search-key order.
The above figure shows one leaf node of a B+-tree for the account file, in which we have
chosen n to be 3 and the search key is branch_name. Note that, since the account file is
ordered by branch_name, the pointers in the leaf node point directly to the file.
Now that we have seen the structure of a leaf node, let us consider how search-key values
are assigned to particular nodes. Each leaf can hold up to n-1 values; we allow leaf nodes
to contain as few as (ceiling of (n-1)/2) values. The ranges of values in each leaf do not
overlap. Thus, if Li and Lj are leaf nodes and i < j, then every search-key value in Li is
less than every search-key value in Lj. If the B+-tree index is to be a dense index, every
search-key value must appear in some leaf node.
Now we can explain the use of the pointer Pn. Since there is a linear order on the leaves based
on the search key values that they contain, we use Pn to chain together the leaf nodes in search
key order. This ordering allows for efficient sequential processing of the file.
The nonleaf nodes of the B+-tree form a multilevel (sparse) index on the leaf nodes. The
structure of nonleaf nodes is the same as that of leaf nodes, except that all pointers are
pointers to tree nodes. A nonleaf node may hold up to n pointers and must hold at least
(ceiling of n/2) pointers. The number of pointers in a node is called the fanout of the
node.
Let us consider a node containing m pointers. For i = 2, 3, ..., m-1, pointer Pi points to the
subtree that contains search-key values less than Ki and greater than or equal to Ki-1.
Pointer Pm points to the part of the subtree that contains those key values greater than or
equal to Km-1, and pointer P1 points to the part of the subtree that contains those
search-key values less than K1.
Unlike other nonleaf nodes, the root node can hold fewer than (ceiling of n/2) pointers;
however, it must hold at least two pointers, unless the tree consists of only one node. It is
always possible to construct a B+-tree, for any n, that satisfies the preceding
requirements. The following figure shows a complete B+-tree for the account file (n = 3).
For simplicity, we have omitted both the pointers to the file itself and the null pointers.
[Figure: the root node contains Motijheel; its two children contain Katabon and Rampura
respectively, above the chained leaf nodes.]
Note that the root of the above B+-tree holds the search-key value Motijheel, which
separates the branch names in the left subtree from those in the right. The branch names
within each leaf or tree node are kept in sorted (alphabetical) order.
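Search in this tree can be sketched as follows; the exact node contents are my reading of the figure, with nodes kept as plain dicts:

```python
import bisect

# Leaves hold the six branch names in order and are chained via "next".
LA = {"keys": ["Dhanmondi", "Kakrail"]}
LB = {"keys": ["Katabon"]}
LC = {"keys": ["Motijheel"]}
LD = {"keys": ["Rampura", "Shantinagar"]}
LA["next"], LB["next"], LC["next"], LD["next"] = LB, LC, LD, None

# Nonleaf nodes: child i holds values >= keys[i-1] and < keys[i].
left = {"keys": ["Katabon"], "children": [LA, LB]}
right = {"keys": ["Rampura"], "children": [LC, LD]}
root = {"keys": ["Motijheel"], "children": [left, right]}

def search(node, key):
    while "children" in node:                       # descend until a leaf
        i = bisect.bisect_right(node["keys"], key)  # number of keys <= key
        node = node["children"][i]
    return key in node["keys"]

assert search(root, "Motijheel") and not search(root, "Banani")
```

Every lookup walks one root-to-leaf path, so its cost grows with the height of the tree rather than the size of the file; the leaf chain additionally supports sequential scans.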
C H A P T E R 11
TRANSACTIONS
A transaction is a sequence of data access operations that transfers the database from one
consistent state to another consistent state (where the new state is not necessarily
different from the old).
Examples of transactions:
Compile a list of all students that take the class CSE 303.
Set the grade 3.0 in the class CSE 304 for the student with
matNr 003.
Give all TAs a 10% pay rise.
Transaction processing systems are systems with large databases and a large number of
users who execute transactions in parallel. Examples include travel reservation systems
(hotels, flights), credit card processing, stock markets and supermarket checkout.
A transaction can terminate successfully or reach some state where it is discovered that
an error has occurred. In the first case, the transaction is said to be committed; in the
second case it is aborted. An aborted transaction must be undone (any changes it has
performed must be reversed) to go back to the previous consistent state. The undo is also
known as a roll back.
BEGIN_TRANSACTION
Instructions
if everything ok then COMMIT
else ABORT
END_OF_TRANSACTION
Some systems do not provide a begin_transaction statement. Instead, the first SQL
command that is executed after the end of a transaction automatically starts a new
transaction.
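The commit/abort skeleton above can be tried out with SQLite; the accounts table here is a toy of my own, not the book's schema:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE account (no TEXT PRIMARY KEY, balance REAL)")
con.execute("INSERT INTO account VALUES ('A-101', 1000)")
con.commit()

try:
    # BEGIN_TRANSACTION is implicit: the first statement opens a transaction
    con.execute("UPDATE account SET balance = balance - 200 WHERE no = 'A-101'")
    raise RuntimeError("something went wrong mid-transaction")
    con.commit()                      # COMMIT -- never reached here
except RuntimeError:
    con.rollback()                    # ABORT: undo the partial update

balance = con.execute("SELECT balance FROM account WHERE no = 'A-101'").fetchone()[0]
assert balance == 1000                # the half-done withdrawal was rolled back
```

Note how SQLite matches the remark above: there is no explicit begin_transaction; the first SQL command after a commit or rollback starts the next transaction automatically.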
Atomicity: Either all operations of a transaction are executed
or none at all => If the transaction is interrupted by some error, all intermediate
changes must be reversed.
Consistency: A transaction must transform the database from a
consistent state to another consistent state.
Isolation: Only after a transaction has completed successfully may its (partial)
results be released for use by other transactions => Transactions act
independently of each other.
Durability: After the successful termination of a transaction, its
results are persistent i.e., they can only be changed or undone by a new transaction,
even if errors or failures occur.
Note: If it is discovered after a commit that the effects of a transaction must be undone, a
compensating transaction is necessary to do this.
C H A P T E R 12
RECOVERY IN TRANSACTIONS
Goal:
Preserve correctness and consistency of data over time, allowing for parallel access
(transactions) of multiple users and occurring errors.
Software error:
in an application program
in the operating system
in the database system
Hardware error:
Operator error (e.g. mounted wrong tape, wrong file system)
Database Sabotage
Recovery is supposed to restore a state of the database that is known to be correct after an
error has occurred or is presumed to have occurred.
Reading and writing data items from and to a database are not immediate, atomic
operations but actions consisting of several steps:
Reading a data item from the database requires the following steps:
1. Find the address of the disk block that contains the data item.
2. Copy this disk block into a buffer in memory (if it is not there
already).
3. Copy the value of the data item into a variable in the
application program.
Writing a data item to the database requires the following steps:
1. Find the address of the disk block that contains the old version
of the data item, or the address of the disk block where the data item is to be newly
entered.
2. Copy this disk block into a buffer in memory (if it is not there
already)
3. Copy the value of the data item from an application variable
into the correct location in the block in the buffer.
4. Copy the updated block from the buffer back to disk. This may
be done immediately or later.
Log Files
Changes to the data are written to a log file to record what has been done by transactions.
This will be used during recovery.
Recovery Approaches
1. A log file is used to record the changes in the database at each update, typically
the old and new values of the updated attribute(s).
2. When an error occurs:
a) If the database itself is damaged (media failure, e.g. by a head
crash) =>
-The last backup is loaded.
-The updates of all committed transactions that were executed since the time
of the last backup are redone using the log (redo). Those transactions that had
not yet committed need to be restarted.
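Redo recovery can be sketched as follows. The log format, simple (transaction, item, new value) triples, is my simplification; real logs carry considerably more:

```python
def recover(backup, log, committed):
    db = dict(backup)                 # start from the last backup
    for txn, item, new_value in log:
        if txn in committed:          # redo only committed transactions
            db[item] = new_value
    # transactions not in `committed` are simply restarted later
    return db

backup = {"A-101": 1000, "A-201": 500}
log = [("T1", "A-101", 800),          # T1 committed before the failure
       ("T2", "A-201", 900)]          # T2 had not yet committed
db = recover(backup, log, committed={"T1"})
assert db == {"A-101": 800, "A-201": 500}
```

T1's update survives the failure; T2's update is ignored, leaving its item at the backup value until T2 is rerun.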
C H A P T E R 13
CONCURRENCY CONTROL
If two or more transactions using the same data items are executed in parallel, problems
may occur:
Example 1:
Two people own two ATM cards for the same bank account. Both people withdraw
money from the account at the same time at two different teller machines.
A writes Tk800 to the database as the new balance. B writes Tk600 to the database as the
new balance.
Evaluation
Example 2
Start transaction A
A reads a copy of the account balance:
copy1 = Tk1000
A withdraws Tk200
copy1 := 1000 - 200 = Tk800
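The lost update in these examples can be replayed step by step; I assume the Tk1000 starting balance of Example 2, with A withdrawing Tk200 and B withdrawing Tk400 (matching the Tk800 and Tk600 writes of Example 1):

```python
balance = 1000

copy_a = balance          # A reads Tk1000
copy_b = balance          # B reads Tk1000 before A writes back
copy_a -= 200             # A withdraws Tk200
copy_b -= 400             # B withdraws Tk400
balance = copy_a          # A writes Tk800
balance = copy_b          # B writes Tk600, overwriting A's update

# A's withdrawal is lost: Tk600 in withdrawals happened, yet the balance
# only dropped by Tk400. The correct serial result would be Tk400.
assert balance == 600
```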
The problems in the examples can occur if the isolation property of transactions is
violated.
13.2 Serializability
If transactions are programmed correctly, they start on a consistent state and leave the
database in a consistent state.
Definitions:
A schedule is called serializable if its effects are the same as the effect of some serial
schedule.
The above could also be phrased as: Two transactions are called serializable if every
parallel execution delivers the same result as some serial execution.
For a set of transactions, there may be several serializable schedules that have different
outcomes, depending on the order of the operations. This is acceptable, since we consider
here transactions that are executed in parallel, meaning that we do not care which one
comes first in the end.
Example 1:
Two transactions reserving a room in the same hotel run in parallel. Both transactions
read the list of available rooms. If the system ensures serializable schedules, data
consistency is guaranteed; however, it is unknown which transaction will get which
room. The first to access the list of rooms will probably reserve the next room on the list.
In this example, it probably does not matter which client gets which room; the system
decides who comes first.
Example 2:
Two transactions both access a bank account, one makes a deposit, the other withdraws
some cash from it. If the bank does not care how high or low the account balance is, this
can be run concurrently. However, if the customer is at the lowest borderline of the
allowed negative balance, the cash withdrawal may not be granted by the bank before the
deposit is made. Therefore, these transactions should not be run in parallel, but the
withdrawal transaction should be started after the deposit transaction has completed.
13.3 Some Synchronization Protocols
Only serial transactions are executed in the system. This usually wastes capacity and user
time. However, serial execution of transactions is always correct, since every transaction
leaves the database in a consistent state.
Begin_Transaction -> read and update phase -> validation phase -> write phase -> End of
Transaction
The above protocol is good for situations with a few conflicts. If there are too many
conflicts, the overhead is immense. Many transactions have to be restarted. The more
transactions are restarted, the higher the number of conflicts; hence even more
transactions are restarted and hence, even more conflicts happen and the cycle continues.
Every transaction receives a timestamp when it is started. Every data item carries a read
timestamp and a write timestamp that mark the most recent read and write accesses,
respectively. A transaction may execute an operation on a data item only if the
timestamps indicate serializability; otherwise the transaction is aborted and restarted. An
upside of this protocol is that no central data structure is necessary, so it suits distributed
databases well. A downside is that not all serializable schedules are recognized, so there
may be redundant restarts.
This is the most popular synchronization protocol. Locking is normally hidden from
application programmers and performed by a resource manager in the system. One of its
important roles is setting and releasing locks.
The lock manager maintains a central table recording every data item currently in use by
a transaction. It stores locking modes and a list of lock requests for each data item.
Locking modes:
exclusive lock (write lock, X lock) for changing an item
shared lock (read lock, S lock) for reading an item
Locking tables are held in main memory. This is because in a system with many
transactions, lock and unlock operations are frequent and therefore must operate speedily.
Two-Phase Locking Protocol (2PL)
Locking single items only makes the access to each individual item consistent. Generally,
transactions consist of more than one data access. To ensure the serializability of
schedules, the two-phase locking protocol is used. This method is based on the
"fundamental locking theorem" of databases and consists of five conditions that in
combination guarantee the serializability of the schedules.
1. Every data item that a transaction needs to access is locked before the access.
2. A transaction does not request a lock that it owns already.
3. A transaction must respect the locks of other transactions (according to a compatibility
table).
4. Every transaction goes through two phases:
- a growth phase in which it requests locks but must not release any locks, and
- a shrink phase in which it releases its acquired locks, but may not request any
new locks.
5. A transaction must release all of its locks before End_of_Transaction.
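The five conditions can be sketched for a single transaction; this toy of mine omits the S/X locking modes, and a conflicting request simply raises instead of blocking:

```python
class TwoPhaseTxn:
    def __init__(self, lock_table, name):
        self.locks, self.table, self.name = set(), lock_table, name
        self.shrinking = False

    def lock(self, item):
        if self.shrinking:
            # condition 4: no new locks once the shrink phase has begun
            raise RuntimeError("2PL violated: lock requested after a release")
        if item in self.locks:
            return                                   # condition 2: never re-request
        holder = self.table.get(item)
        if holder is not None and holder != self.name:
            raise RuntimeError("held by " + holder)  # condition 3: respect others
        self.table[item] = self.name                 # condition 1: lock before access
        self.locks.add(item)

    def unlock(self, item):
        self.shrinking = True                        # the shrink phase begins
        self.table.pop(item, None)
        self.locks.discard(item)

table = {}
t1 = TwoPhaseTxn(table, "T1")
t1.lock("A"); t1.lock("B")                           # growth phase
t1.unlock("A")                                       # shrink phase starts
try:
    t1.lock("C")                                     # illegal under condition 4
    ok = False
except RuntimeError:
    ok = True
assert ok and table == {"B": "T1"}
```

Condition 5 corresponds to calling unlock on every remaining lock before End_of_Transaction, which is legal at any point in the shrink phase.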
C H A P T E R 14
ADVANCED DATABASES
As part of advanced databases, I include here distributed databases, data warehouses and
multimedia databases, and introduce data mining and NoSQL.
Advantages:
The client machines might have stored data of their own, and the server might
have applications of its own. Therefore, each machine will act as a server for
some users and a client for others. A single client machine might be able to
access several different machines. Such access can basically be achieved in
the following two different ways:
[Figure: client-server architecture — end users run applications on client machines; the
DBMS and the database reside on the server.]
[Figure: several client machines connected to a server machine by a communication
network.]
[Figure: a distributed database — multiple clients and servers connected by a
communication network.]
14.2 Data Warehouses
Explanation:
The purpose of most databases is to present current, not historical data. Data in
traditional databases is not always associated with a time whereas data in a DW
always is.
Advantages:
14.2.1 Application Areas of Data Warehouses
Multimedia databases store multimedia such as images, audio and video. The database
functionality becomes important when the number of multimedia objects stored is large.
14.4 Data Mining
Definition:
2) Bio Science
Changes in gene expression
4) Marketing
i) Customer Buying Patterns
ii) Market Basket Analysis
The key deciding factors why NoSQL databases have become the first alternative to
relational databases are scalability, availability and fault tolerance, which I will go into in
detail later on. They are in fact schema-less data models with horizontal scalability,
distributed architecture and flexible languages and interfaces.
The importance of data is summed up as: if your data is not growing, then neither is your
business.
There are four general types of NoSQL databases, each with their own specific attributes:
Key-Value store: These are among the least complex NoSQL options. They are
designed for storing data in a schema-less way: all of the data consists of an
indexed key and a value.
Column store: (also known as wide-column stores) These databases are designed
for storing data tables as sections of columns of data, rather than as rows of data.
Wide-column stores offer very high performance and a highly scalable
architecture.
Document database: These store semi-structured documents (for example, JSON),
pairing each key with a complex data structure known as a document.
Graph database: Based on graph theory, these databases are designed for data
whose relations are well represented as a graph of interconnected elements.
4. Data complexity: data is stored and managed across data stores and data centers in
different locations.
Continuous Data Availability
Hardware failures are apt to occur in today's businesses. Fortunately, NoSQL databases
have distributed architectures, so that if one or more database servers or nodes go down,
the other nodes of the system continue to operate, exhibiting fault tolerance. When
deployed in an appropriate way, NoSQL databases can thus provide high performance at
a massive scale without going down, and hence without losing real dollars.
Better Architecture
In many cases a NoSQL database provides a more suitable architecture for a particular application. Modern NoSQL databases not only store and manage application data but also offer an immediate understanding of complex data and facilitate flexible data mining, analysis and decision-making.
Workload diversity: Big data comes in all shapes, colors and sizes, which calls for a flexible design rather than a rigid schema. Such a design can run transactions and analytics quickly and identify relevant items in huge volumes of data.
Scalability: With big data comes the need to scale rapidly and elastically across multiple data centers, and even across clouds.
Continuous Availability: Big data applications cannot afford downtime. Because a NoSQL environment has no single point of failure, data remains available even when individual nodes fail, without sacrificing high performance.
Cost: Deploying NoSQL properly delivers all of the benefits above and, above all, lowers operational costs.
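The scaling idea behind many of these points is sharding: partitioning keys across servers by hash so that each server holds only a fraction of the data. A minimal sketch follows; the helper names are hypothetical, and real systems typically use consistent hashing so that adding shards does not reshuffle every key, unlike the naive modulo scheme shown here.

```python
import hashlib

def shard_for(key, shard_count):
    """Map a key to a shard index with a stable hash, so the same key
    always lands on the same shard and keys spread out evenly."""
    digest = hashlib.sha256(key.encode()).hexdigest()
    return int(digest, 16) % shard_count

# Four in-memory "servers" standing in for real shard nodes
shards = [dict() for _ in range(4)]

def put(key, value):
    shards[shard_for(key, len(shards))][key] = value

def get(key):
    return shards[shard_for(key, len(shards))][key]

for i in range(1000):
    put(f"user:{i}", i)
print([len(s) for s in shards])  # roughly 250 keys per shard
```

Adding more shards spreads the same workload over more machines, which is exactly the rapid, elastic horizontal scaling described above.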
MISCELLANEOUS DATABASE PROJECT
About the Author
Rosina S Khan, the author of this book, is a former senior
faculty member of a private university in Dhaka city. She
taught the undergraduate course Databases and guided
group and individual experiments in Database labs for
many semesters. Furthermore, she has managed and
supervised several 4th-year undergraduate theses and
projects in the field of Databases. Her own master's thesis,
completed as part of an MSc degree in Software Technology
in Germany, was also based on Databases.
This book was distributed courtesy of:
For your own Unlimited Reading and FREE eBooks today, visit:
http://www.Free-eBooks.net
COPYRIGHT INFORMATION
Free-eBooks.net respects the intellectual property of others. When a book's copyright owner submits their work to Free-eBooks.net, they are granting us permission to distribute such material. Unless
otherwise stated in this book, this permission is not passed onto others. As such, redistributing this book without the copyright owner's permission can constitute copyright infringement. If you
believe that your work has been used in a manner that constitutes copyright infringement, please follow our Notice and Procedure for Making Claims of Copyright Infringement as seen in our Terms
of Service here:
http://www.free-ebooks.net/tos.html