Database Management System
BCAC 0020
Topic: Database Design
Presented by: Sanjiv Agrawal
Assistant Professor
Computer Engineering & Applications
Department,
GLA University, Mathura
[email protected], +91-8923015116
Normalization
• Normalization is a database design technique that reduces data
redundancy and eliminates undesirable characteristics like Insertion,
Update and Deletion Anomalies. Normalization rules divide larger
tables into smaller tables and links them using relationships. The
purpose of Normalization is to eliminate redundant (repetitive) data
and ensure data is stored logically.
• If a database design is not perfect, it may contain anomalies, which are
like a bad dream for any database administrator. Managing a database
with anomalies is next to impossible.
• It is defined as a step-by-step process of decomposing a complex
relation into a simple and stable data structure.
• It is a formal process that can be followed to achieve a good database
design.
• It is also used to check that an existing design is of good quality.
• The different stages of normalization are known as “Normal Forms”.
• To accomplish normalization we need to understand the concept of
Functional Dependencies.
What are Anomalies
• Update anomalies − If data items are scattered and are not
linked to each other properly, then it could lead to strange
situations. For example, when we try to update one data item
having its copies scattered over several places, a few instances
get updated properly while a few others are left with old values.
Such instances leave the database in an inconsistent state.
• Deletion anomalies − We tried to delete a record, but parts of
it were left undeleted because, unknown to us, the data was also
saved somewhere else.
• Insert anomalies − We tried to insert data in a record that does
not exist at all.
• Normalization is a method to remove all these anomalies and
bring the database to a consistent state.
• Data redundancy is a condition created within a database or
data storage technology in which the same piece of data is held
in more than one place.
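The update anomaly described above can be illustrated with a minimal sketch. The table layout, student names, and building values here are hypothetical and chosen only for illustration; the attribute names follow the roll_no / dept_name / dept_building style used elsewhere in these slides.

```python
# Hypothetical denormalized table: the department's building is repeated per student.
students = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CEA", "dept_building": "Block A"},
    {"roll_no": 2, "name": "Ravi", "dept_name": "CEA", "dept_building": "Block A"},
]

# Update anomaly: only one of the redundant copies is changed.
students[0]["dept_building"] = "Block B"
buildings = {s["dept_building"] for s in students if s["dept_name"] == "CEA"}
inconsistent = len(buildings) > 1  # the same department now has two buildings

# Normalized design: each fact is stored once and linked by dept_name.
student_table = [
    {"roll_no": 1, "name": "Asha", "dept_name": "CEA"},
    {"roll_no": 2, "name": "Ravi", "dept_name": "CEA"},
]
department_table = {"CEA": {"dept_building": "Block A"}}
department_table["CEA"]["dept_building"] = "Block B"  # one update, no stale copies

print(inconsistent)
```

In the normalized version the building is stored in exactly one place, so a single update cannot leave an old value behind.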
Use of Functional Dependencies
• Identifying functional dependencies helps avoid data redundancy, so the
same data is not repeated at multiple locations in the database.
• It helps you to maintain the quality of data in the database.
• It helps you to define the meanings and constraints of databases.
• It helps you to identify bad designs.
• It helps you to find the facts regarding the database design.
Functional Dependencies Definition
Let R be a relation schema, with
α ⊆ R and β ⊆ R.
The functional dependency
α → β (α: determinant, β: dependent)
holds on R if and only if, for any legal relation r(R), whenever any two tuples t1 and
t2 of r agree on the attributes α, they also agree on the attributes β. That is,
t1[α] = t2[α] ⇒ t1[β] = t2[β]
Example: Consider r(A, B) with the following instance of r.
A B
1 4
1 5
3 7
On this instance, B → A holds; A → B does NOT hold.
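The check described by the definition can be sketched in a few lines of Python. This is an illustrative sketch, not part of the original slides: the function name `fd_holds` and the list-of-dicts representation of a relation instance are assumptions made for the example.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs holds on this instance (a list of dicts)."""
    seen = {}  # maps each lhs value combination to the rhs values first seen with it
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        if seen.setdefault(key, val) != val:
            return False  # two tuples agree on lhs but differ on rhs
    return True

# The instance of r(A, B) from the example above.
r = [{"A": 1, "B": 4}, {"A": 1, "B": 5}, {"A": 3, "B": 7}]

print(fd_holds(r, ["B"], ["A"]))  # True
print(fd_holds(r, ["A"], ["B"]))  # False
```

Note that such a check only shows whether a dependency holds on one particular instance; a functional dependency on the schema must hold for every legal instance.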
Functional Dependencies
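Example 1:
X Y
1 1
2 1
3 2
4 3
2 5
X → Y does not hold: X = 2 appears with both Y = 1 and Y = 5.
Example 2:
X Y
1 1
2 1
3 2
4 3
2 1
X → Y holds: every X value appears with exactly one Y value.
Example 3:
RollNo → Name holds within a single class; otherwise it does not.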
Given a set F (set of functional dependencies), there are certain
other functional dependencies that are logically implied by F.
If A → B and B → C, then we can infer that A → C, etc.
The set of all functional dependencies logically implied by F is the
closure of F.
We denote the closure of F by F+.
Armstrong’s axioms/properties of
functional dependencies:
1.Reflexivity: If Y is a subset of X, then X → Y holds by the reflexivity rule.
e.g., {roll_no, name} → name is valid.
2.Augmentation: If X → Y is a valid dependency, then XZ → YZ is also valid by
the augmentation rule.
e.g., if {roll_no} → name is valid, then {roll_no, marks} → {name, marks} is also
valid.
3.Transitivity: If X → Y and Y → Z are both valid dependencies, then X → Z is also
valid by the Transitivity rule.
e.g., if roll_no → dept_name and dept_name → dept_building, then roll_no →
dept_building is also valid.
4.Union: If X → Y and X → Z are both valid dependencies, then X → YZ is also
valid by the Union rule. e.g., if roll_no → name and roll_no → marks, then
roll_no → {name, marks}.
5.Decomposition/Splitting: If X → YZ is a valid dependency, then X → Y and
X → Z are also valid by the Decomposition rule.
6.Pseudo-transitivity: If X → Y and YZ → A, then XZ → A.
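Rules 4 to 6 are not independent: each follows from reflexivity, augmentation, and transitivity. The short derivations below are a standard textbook argument added here for illustration, not taken from the slides.

```latex
\begin{align*}
\textbf{Union:}\quad & X \to Y,\; X \to Z && \text{(given)} \\
& X \to XY && \text{(augment } X \to Y \text{ with } X\text{)} \\
& XY \to YZ && \text{(augment } X \to Z \text{ with } Y\text{)} \\
& X \to YZ && \text{(transitivity)} \\[6pt]
\textbf{Decomposition:}\quad & X \to YZ && \text{(given)} \\
& YZ \to Y,\; YZ \to Z && \text{(reflexivity)} \\
& X \to Y,\; X \to Z && \text{(transitivity)} \\[6pt]
\textbf{Pseudo-transitivity:}\quad & X \to Y,\; YZ \to A && \text{(given)} \\
& XZ \to YZ && \text{(augment } X \to Y \text{ with } Z\text{)} \\
& XZ \to A && \text{(transitivity)}
\end{align*}
```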
Types of Functional dependencies
1.Trivial functional dependency
2.Non-Trivial functional dependency
3.Multivalued functional dependency
4.Transitive functional dependency
Attribute Closure
• The closure of an attribute set X is the set of all attributes that are
functionally dependent on X with respect to F. It is denoted by X+,
which means everything that X can determine.
Q. Consider a relation R(A,B,C,D,E,F) where F: E->A, E->D, A->C, A->D,
AE->F, AG->K. Find the closure of E or E+. (See above example)
• Consider the relation R(A,B,C,D,E,F) where F: B->C, BC->AD, D->E, CF->B.
Find the closure of B.
• B+ = {B,C,A,D,E}
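The closure computation can be sketched as a simple fixed-point loop: keep adding the right-hand side of any dependency whose left-hand side is already covered, until nothing changes. The function name `closure` and the encoding of each dependency as a pair of strings (one character per attribute) are assumptions made for this sketch.

```python
def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply lhs -> rhs if lhs is already determined and rhs adds something new.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

# F: B->C, BC->AD, D->E, CF->B from the example above.
fds = [("B", "C"), ("BC", "AD"), ("D", "E"), ("CF", "B")]
print(sorted(closure("B", fds)))  # ['A', 'B', 'C', 'D', 'E']
```

The attribute F is never added because no dependency with a covered left-hand side has F on its right-hand side.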
Attribute Closure
Q. R(A,B,C,D,E) and F: A->B, B->C, C->D, A->E. Find the closure of F.
Solution:
A+= {A,B,C,D,E}
B+= {B,C,D}
C+= {C,D}
D+= {D}
E+= {E}
Restricted to a single attribute on each side,
F+= {A->A, A->B, A->C, A->D, A->E, B->B, B->C, B->D, C->C, C->D, D->D, E->E}
Attribute Closure
Q. Consider a relation R ( A , B , C , D , E , F , G ) with the functional
dependencies- A → BC, BC → DE, D → F, CF → G.
Find the closure of some attributes and attribute sets.
Solution:
A+ = { A }
= { A , B , C } ( Using A → BC )
= { A , B , C , D , E } ( Using BC → DE )
= { A , B , C , D , E , F } ( Using D → F )
= { A , B , C , D , E , F , G } ( Using CF → G )
Thus,
A+ = { A , B , C , D , E , F , G }
D+ = { D , F }, { B , C }+ = { B , C , D , E , F , G }
Candidate Keys and Super Keys
Candidate Key: a minimal set of attributes of a relation which can be used to
identify a tuple uniquely.
Super Key: a set of attributes of a relation which can be used to identify a tuple
uniquely.
A candidate key is always a super key, but not vice versa.
A set of attributes whose closure is the set of all attributes of the relation is
called a super key of the relation; if it is also minimal, it is called a candidate key.
Therefore wx, wz, xy, yz are the candidate
keys in this relation because the closure of
each contains all the attributes of the relation.
Candidate Keys and Super Keys
Q. R(A,B,C,D,E,F) where F: A->BC, B->D, C->DE, BC->F. Find the
candidate keys of R.
Solution:
A+= {A,B,C,D,E,F}={R}=>A is a candidate key
B+= {B,D} => B is not a candidate key
C+= {C,D,E} => C is not a candidate key
BC+= {B,C,D,E,F} => BC is not a candidate key
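The search for candidate keys can be sketched as a brute-force scan over attribute sets in order of increasing size, keeping those whose closure is the whole relation and skipping any set that already contains a smaller key. The names `closure` and `candidate_keys` and the string encoding of dependencies are assumptions made for this sketch; the approach is exponential and only suitable for small classroom examples.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the full relation."""
    keys = []
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key: a super key, but not minimal
            if closure(candidate, fds) == set(attributes):
                keys.append(candidate)
    return keys

# F: A->BC, B->D, C->DE, BC->F from the example above.
fds = [("A", "BC"), ("B", "D"), ("C", "DE"), ("BC", "F")]
print(candidate_keys("ABCDEF", fds))  # [{'A'}]
```

Since A never appears on the right-hand side of any dependency, every key must contain A, which is why A is the only candidate key here.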
Closure of F (F+): F+ is the set of all FDs that can be inferred/
derived from F. By applying Armstrong's axioms repeatedly to F, we can
compute all the FDs.
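As a rough illustration, F+ can be enumerated for a small relation. Rather than applying the axioms one by one, this sketch swaps in an equivalent shortcut: X → Y is in F+ exactly when Y is a subset of X+. The function names and the frozenset-pair representation of a dependency are assumptions made for the example, and the enumeration grows exponentially with the number of attributes.

```python
from itertools import combinations

def closure(attrs, fds):
    """Return the closure of attrs under fds, a list of (lhs, rhs) string pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def nonempty_subsets(attrs):
    """Yield every non-empty subset of attrs as a frozenset."""
    items = sorted(attrs)
    for size in range(1, len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)

def fd_closure(attributes, fds):
    """Return F+ as a set of (lhs, rhs) pairs: X -> Y for every non-empty Y within X+."""
    return {
        (lhs, rhs)
        for lhs in nonempty_subsets(attributes)
        for rhs in nonempty_subsets(closure(lhs, fds))
    }

# R(A,B,C,D,E) with F: A->B, B->C, C->D, A->E.
fds = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "E")]
f_plus = fd_closure("ABCDE", fds)
print((frozenset("A"), frozenset("D")) in f_plus)  # True
print((frozenset("B"), frozenset("A")) in f_plus)  # False
```

The result includes trivial dependencies and dependencies with composite sides, so it is much larger than a hand-written list of single-attribute dependencies.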