Normalization ppt 3NF
What is Normalization?
Normalization is a database design technique that organizes tables so as to reduce data redundancy
and undesirable dependencies between attributes.
Normalization divides larger tables into smaller tables and links them using relationships.
The purpose of normalization is to eliminate redundant (repeated) data and ensure data is stored
logically.
E. F. Codd, the inventor of the relational model, proposed the theory of normalization.
Redundancy
Row-level redundancy: if Sid is the primary key of the table, no two rows can share the same Sid,
so duplicate rows such as the one shown below can be identified and removed:
1 Jojo 20
Redundancy (Cont..)
Redundant column values:
Sid  Sname  Cid  Cname  Fid  Fname  Salary
1    AA     C1   DBMS   F1   Jojo   30000
2    BB     C2   JAVA   F2   KK     50000
What is an Anomaly?
Anomalies are problems that can occur in poorly planned, unnormalized databases where all the data
is stored in one table (a flat-file database).
Types of Anomalies:
• Insert
• Delete
• Update
Anomalies in DBMS
Insert Anomaly: An insert anomaly occurs when certain attributes cannot be inserted into the
database without the presence of other attributes.
Delete Anomaly: A delete anomaly exists when certain attributes are lost because of the deletion
of other attributes.
Update Anomaly: An update anomaly exists when one or more instances of duplicated data are
updated, but not all.
Anomaly Example
The University table below consists of seven attributes: Sid, Sname, Cid, Cname, Fid,
Fname, and Salary. Sid acts as the key attribute (primary key) of the relation.
Insertion Anomaly
Suppose a new faculty member joins the University, and the Database Administrator tries to insert
the faculty data into the above table. The insert fails: Sid is the primary key and cannot be NULL,
and a faculty member who has no student yet has no Sid to supply.
This type of anomaly is known as an insertion anomaly.
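The insertion anomaly can be reproduced with a minimal sketch using Python's built-in sqlite3 module. The column types, the faculty member F3 and the values 'LL' and 45000 are assumptions made for illustration; only the table and column names come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sid is declared INT (not INTEGER) so that SQLite does not auto-generate a value for it.
conn.execute("""
    CREATE TABLE University (
        Sid INT NOT NULL PRIMARY KEY,
        Sname TEXT, Cid TEXT, Cname TEXT,
        Fid TEXT, Fname TEXT, Salary INT
    )
""")

# Hypothetical new faculty member F3 with no student yet, so there is no Sid to supply.
try:
    conn.execute(
        "INSERT INTO University (Sid, Fid, Fname, Salary) "
        "VALUES (NULL, 'F3', 'LL', 45000)"
    )
    insert_rejected = False
except sqlite3.IntegrityError as exc:
    insert_rejected = True
    print("Insert rejected:", exc)

row_count = conn.execute("SELECT COUNT(*) FROM University").fetchone()[0]
```

The faculty data cannot be stored on its own, because the only key of the table belongs to the student.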
Delete Anomaly
When the Database Administrator deletes the student details of Sid=2 from the above table, the
faculty and course information stored in the same row is deleted too and cannot be recovered.
SQL:
DELETE FROM University WHERE Sid=2;
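A minimal sketch of the delete anomaly, assuming the two rows of the University table shown earlier and SQLite column types chosen for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE University (
        Sid INT NOT NULL PRIMARY KEY,
        Sname TEXT, Cid TEXT, Cname TEXT,
        Fid TEXT, Fname TEXT, Salary INT
    )
""")
conn.executemany(
    "INSERT INTO University VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "AA", "C1", "DBMS", "F1", "Jojo", 30000),
        (2, "BB", "C2", "JAVA", "F2", "KK", 50000),
    ],
)

# Deleting the student also removes the only record of faculty F2 and course C2.
conn.execute("DELETE FROM University WHERE Sid = 2")

remaining_f2 = conn.execute(
    "SELECT COUNT(*) FROM University WHERE Fid = 'F2'"
).fetchone()[0]
remaining_c2 = conn.execute(
    "SELECT COUNT(*) FROM University WHERE Cid = 'C2'"
).fetchone()[0]
print("Rows left for F2:", remaining_f2, "and for C2:", remaining_c2)
```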
Update Anomaly
When the Database Administrator changes the salary of faculty F1 from 30000 to 40000 in the
University table, the salary must be updated in every row that mentions F1, because the value is
stored redundantly. If any of those rows is missed, the table holds two different salaries for the
same faculty member. This is an update anomaly.
SQL:
UPDATE University
SET Salary = 40000
WHERE Fid = 'F1';
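A minimal sketch of the update anomaly. The third row (Sid 3, a second student taught by F1) and the salary 45000 are hypothetical additions, included only so that F1's salary is stored more than once.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE University (
        Sid INT NOT NULL PRIMARY KEY,
        Sname TEXT, Cid TEXT, Cname TEXT,
        Fid TEXT, Fname TEXT, Salary INT
    )
""")
conn.executemany(
    "INSERT INTO University VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, "AA", "C1", "DBMS", "F1", "Jojo", 30000),
        (2, "BB", "C2", "JAVA", "F2", "KK", 50000),
        (3, "CC", "C1", "DBMS", "F1", "Jojo", 30000),  # hypothetical extra row
    ],
)

# A correct update has to touch every row that repeats F1's salary.
cur = conn.execute("UPDATE University SET Salary = 40000 WHERE Fid = 'F1'")
rows_touched = cur.rowcount

# A partial update (only one of the F1 rows) leaves the table inconsistent.
conn.execute("UPDATE University SET Salary = 45000 WHERE Fid = 'F1' AND Sid = 1")
f1_salaries = sorted(
    row[0]
    for row in conn.execute("SELECT DISTINCT Salary FROM University WHERE Fid = 'F1'")
)
print("Rows touched:", rows_touched, "Salaries now stored for F1:", f1_salaries)
```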
Normal forms
The theory of data normalization is still being developed further. For example, there are
discussions even of a 6th Normal Form. However, in most practical applications, normalization
achieves its best in 3rd Normal Form.
[Figure: evolution of normalization theories]
First Normal Form (1NF)
According to E. F. Codd, a relation is in 1NF if each cell of the relation contains only a single
atomic value.
1NF Example
Example:
The following Course_Content relation is not in 1NF because the Content attribute contains
multiple values.
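The conversion to 1NF can be sketched as follows. The Course_Content rows and the Course column name are hypothetical, since the original table is not reproduced here; the idea is only that each multi-valued Content cell becomes one row per value.

```python
# Hypothetical unnormalized rows: the Content cell holds several values at once.
unnormalized = [
    ("Programming", "Java, C++"),
    ("Web", "HTML, PHP, ASP"),
]

# 1NF: one atomic Content value per row.
normalized = [
    (course, item.strip())
    for course, content in unnormalized
    for item in content.split(",")
]

for row in normalized:
    print(row)
```

After the split, Course alone no longer identifies a row, so the pair (Course, Content) would be the natural key.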
Rules of 1NF
Additional:
Choose a primary key.
Reminder:
A primary key is unique, not null, and does not change. A primary key can be either a single
attribute or a combination of attributes.
Second Normal Form (2NF)
According to E. F. Codd, a relation is in 2NF if it satisfies the following conditions:
• It is in 1NF.
• No non-prime attribute is partially dependent on a candidate key.
Prime and Non Prime Attributes
Prime attributes: Attributes that are part of at least one candidate key are called prime attributes.
Non-prime attributes: Attributes that are not part of any candidate key are called non-prime
attributes.
Functional Dependency
A functional dependency FD: X → Y means that the values of Y are determined by the values of X:
any two tuples that share the same values of X necessarily have the same values of Y.
We write this as:
X → Y (read as: X determines Y, or Y depends on X)
Functional Dependency
Whenever two rows in this table feature the same StudentID, they also necessarily have the same
Semester values. This basic fact can be expressed by a functional dependency:
StudentID → Semester.
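Whether a functional dependency holds in a given set of rows can be checked mechanically. The sketch below uses hypothetical sample rows (the Lecture column and all values are assumptions); the helper name fd_holds is illustrative as well.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if every pair of rows agreeing on lhs also agrees on rhs."""
    seen = {}
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first Y seen for each X; a different Y later breaks the FD.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Hypothetical sample data.
rows = [
    {"StudentID": 1, "Semester": 6, "Lecture": "Numerical Methods"},
    {"StudentID": 1, "Semester": 6, "Lecture": "Data Mining"},
    {"StudentID": 2, "Semester": 4, "Lecture": "Numerical Methods"},
]

semester_fd = fd_holds(rows, ["StudentID"], ["Semester"])
lecture_fd = fd_holds(rows, ["StudentID"], ["Lecture"])
print("StudentID -> Semester:", semester_fd)
print("StudentID -> Lecture:", lecture_fd)
```

Note that sample data can only refute a dependency; whether it holds in general is a statement about the rules of the domain, not about one table instance.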
Partial Dependency
If a non-prime attribute can be determined by only a part (a proper subset) of a candidate key in
a relation, this is known as a partial dependency.
2NF Example
In the Student_Project relation, the prime (key) attributes are Stu_ID and Proj_ID.
According to the rule, the non-key attributes, i.e. Stu_Name and Proj_Name, must depend on both
key attributes together and not on either of them individually.
But we find that Stu_Name can be identified by Stu_ID alone and Proj_Name can be identified by
Proj_ID alone. This is called partial dependency, which is not allowed in Second Normal Form.
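The check described above can be sketched in a few lines. This is a simplified illustration, assuming a single candidate key and an explicitly listed set of functional dependencies; the function name partial_dependencies is hypothetical.

```python
def partial_dependencies(key, fds):
    """List (determinant, attribute) pairs where a non-key attribute
    depends on a proper subset of the key."""
    key = frozenset(key)
    found = []
    for lhs, rhs in fds:
        lhs = frozenset(lhs)
        if lhs < key:  # proper subset of the key
            for attr in rhs:
                if attr not in key:
                    found.append((sorted(lhs), attr))
    return found


key = {"Stu_ID", "Proj_ID"}
fds = [
    ({"Stu_ID"}, {"Stu_Name"}),
    ({"Proj_ID"}, {"Proj_Name"}),
]

violations = partial_dependencies(key, fds)
for determinant, attribute in violations:
    print(determinant, "->", attribute, "is a partial dependency")
```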
2NF Example (Cont..)
We split the relation in two, as depicted in the picture above, so that no partial dependency
remains.
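One possible decomposition can be sketched as projections of the original rows. The sample rows are hypothetical, and because the picture is not reproduced here, this sketch keeps a separate linking table for the (Stu_ID, Proj_ID) pairs, which is an assumption and may differ from the slide.

```python
# Hypothetical Student_Project rows: (Stu_ID, Proj_ID, Stu_Name, Proj_Name).
student_project = [
    (1, 10, "Asha", "Compiler"),
    (1, 20, "Asha", "Parser"),
    (2, 10, "Ben", "Compiler"),
]

# Each non-key attribute moves to a table keyed by the attribute it depends on.
student = sorted({(s, name) for s, _, name, _ in student_project})
project = sorted({(p, pname) for _, p, _, pname in student_project})
assignment = sorted({(s, p) for s, p, _, _ in student_project})

# Joining the pieces back together recovers the original rows (lossless).
student_name = dict(student)
project_name = dict(project)
rejoined = sorted(
    (s, p, student_name[s], project_name[p]) for s, p in assignment
)

print(student)
print(project)
print(rejoined == sorted(student_project))
```

Each name is now stored once per student or project instead of once per assignment.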
Example 2NF
The Course Name depends only on CourseID, which is a part of the primary key,
not on the whole primary key {CourseID, SemesterID}. This is called partial dependency.
Solution:
Move Course Name into a new table keyed by CourseID. CourseID also stays in the original table
as part of its key.
Example 2NF (Cont..)
CourseID  SemesterID  Num Student
IT101     201301      25
IT101     201302      25
IT102     201301      30
IT102     201302      35
IT103     201401      20
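Third Normal Form (3NF)
According to E. F. Codd, a relation is in third normal form (3NF) if it satisfies the following
conditions:
• It is in 2NF.
• No non-prime attribute is transitively dependent on the key.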
Transitive Dependency
X->Z is a transitive dependency if the following three conditions hold true:
X->Y
Y does not ->X
Y->Z
Transitive Dependency(Cont..)
{Book} -> {Author} (if we know the book, we know the author's name)
3NF Example
In the above Student_detail relation, Stu_ID is the key and the only prime attribute.
City can be determined by Stu_ID and also by Zip alone.
Zip is not a superkey, and City is not a prime attribute. Additionally, Stu_ID → Zip → City, so
a transitive dependency exists.
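The reasoning above can be reproduced with the standard attribute-closure computation. This is a minimal sketch that assumes the relation has only the three attributes named here and only the two dependencies Stu_ID → Zip and Zip → City.

```python
def closure(attrs, fds):
    """Return every attribute determined by attrs under the given FDs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Apply an FD once its whole left side is already determined.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


all_attributes = {"Stu_ID", "Zip", "City"}
fds = [({"Stu_ID"}, {"Zip"}), ({"Zip"}, {"City"})]

stu_closure = closure({"Stu_ID"}, fds)
zip_closure = closure({"Zip"}, fds)
zip_is_superkey = zip_closure == all_attributes

print("Stu_ID determines:", sorted(stu_closure))
print("Zip determines:", sorted(zip_closure))
print("Zip is a superkey:", zip_is_superkey)
```

Stu_ID reaches City only through Zip, and Zip itself does not determine the whole relation, which is exactly the situation 3NF forbids.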
3NF Example (Cont..)
To bring this relation into third normal form, we break the relation into two relations as follows:
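A minimal sketch of such a split using Python's built-in sqlite3 module. The table name ZipCodes, the column types and all sample values are assumptions; only Stu_ID, Zip and City come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- City now depends on the key of its own table.
    CREATE TABLE ZipCodes (
        Zip  TEXT NOT NULL PRIMARY KEY,
        City TEXT NOT NULL
    );
    CREATE TABLE Student_Detail (
        Stu_ID INT NOT NULL PRIMARY KEY,
        Zip    TEXT REFERENCES ZipCodes(Zip)
    );
    INSERT INTO ZipCodes VALUES ('10001', 'New York'), ('94105', 'San Francisco');
    INSERT INTO Student_Detail VALUES (1, '10001'), (2, '10001'), (3, '94105');
""")

# A join recovers the original (Stu_ID, Zip, City) view when it is needed.
rows = conn.execute("""
    SELECT s.Stu_ID, s.Zip, z.City
    FROM Student_Detail AS s
    JOIN ZipCodes AS z ON z.Zip = s.Zip
    ORDER BY s.Stu_ID
""").fetchall()
city_rows = conn.execute("SELECT COUNT(*) FROM ZipCodes").fetchone()[0]

for row in rows:
    print(row)
```

Each city name is stored once per Zip rather than once per student, so changing it is a single-row update.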
Example 3NF
The Teacher Tel is a nonkey attribute, and the Teacher Name is also a nonkey attribute.
But Teacher Tel depends on Teacher Name.
It is called transitive dependency.
Solution:
Remove Teacher Name and Teacher Tel together to create a new table.
Example 3NF
StudyID  Course Name  T.ID
1        Database     T1
2        Database     T2
3        Web Prog     T3
4        Web Prog     T3
5        Networking   T4

ID  Teacher Name  Teacher Tel
T1  Sok Piseth    012 123 456
T2  Sao Kanha     0977 322 111
T3  Chan Veasna   012 412 333
T4  Pou Sambath   077 545 221

Done?
Oh no, it is still not in 1NF yet. Remove the repeating row.
Note about primary key:
- In theory, you can choose Teacher Name to be a primary key.
- But in practice, you should add Teacher ID as the primary key.
Example Table
StudentID is the primary key.
Is it 1NF?
How can you make it 1NF?
Example 1 (Cont..)
But now the studentID no longer uniquely identifies each row. You now need to declare studentID
and subject together to uniquely identify each row. So the new key is StudentID and Subject.
Is it 2NF?
Example 1 (Cont..)
Studentname and address are dependent on studentID (which is part of the key). This is good.
But they are not dependent on Subject (the other part of the key).
2NF requires all non-key fields to be dependent on the ENTIRE key (studentID + subject), so the
table is not yet in 2NF.
Example (Cont..)
But is it 3NF?
Example 1 (Cont..)
But either way, non-key fields are dependent on MORE THAN THE PRIMARY KEY (studentID).
And 3NF says that non-key fields must depend on nothing but the key.
Example 2
We will use the Student_Grade_Report table below, from a School database, as our example to
explain the process for 1NF.
Process for 1NF
In the Student Grade Report table, the repeating group is the course information. A student can take
many courses.
Remove the repeating group. In this case, it’s the course information for each student.
Identify the PK for your new table.
The PK must uniquely identify each row of the new table; here it is the combination of StudentNo
and CourseNo.
After removing all the attributes related to the course and student, you are left with the student course
table (StudentCourse).
The Student table (Student) is now in first normal form with the repeating group removed.
The two new tables are shown below:
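The composite key described in this process can be sketched with Python's built-in sqlite3 module. The StudentName and Grade columns, the column types and the sample values are assumptions; Student, StudentCourse, StudentNo and CourseNo come from the text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (
        StudentNo   INT NOT NULL PRIMARY KEY,
        StudentName TEXT
    );
    -- The repeating course group moves here, one row per student and course.
    CREATE TABLE StudentCourse (
        StudentNo INT  NOT NULL REFERENCES Student(StudentNo),
        CourseNo  TEXT NOT NULL,
        Grade     TEXT,
        PRIMARY KEY (StudentNo, CourseNo)
    );
""")
conn.execute("INSERT INTO Student VALUES (1, 'AA')")
conn.execute("INSERT INTO StudentCourse VALUES (1, 'C1', 'A')")
conn.execute("INSERT INTO StudentCourse VALUES (1, 'C2', 'B')")

# The same student may take many courses, but each pair may appear only once.
try:
    conn.execute("INSERT INTO StudentCourse VALUES (1, 'C1', 'C')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

course_count = conn.execute(
    "SELECT COUNT(*) FROM StudentCourse WHERE StudentNo = 1"
).fetchone()[0]
print("Duplicate rejected:", duplicate_rejected, "Courses stored:", course_count)
```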
Process for 3NF
Eliminate all dependent attributes in transitive relationship(s) from each of the tables that have a
transitive relationship.
Create new table(s) with removed dependency.
Check new table(s) as well as table(s) modified to make sure that each table has a determinant and
that no table contains inappropriate dependencies.
See the four new tables below.