F U-4 PDF
A major aim of relational database design is to minimize data redundancy. Storing the same
information in more than one place within a database is called redundancy, and it can lead to
several problems, which are illustrated by the following example:
The key for Hourly_Emps is ssn. In addition, suppose that the hourly_wages attribute is
determined by the rating attribute. That is, for a given rating value, there is only one permissible
hourly_wages value. This IC is an example of a functional dependency. It leads to possible
redundancy in the relation Hourly_Emps, as shown below:
If the same value appears in the rating column of two tuples, the IC tells us that the same value
must appear in the hourly_wages column as well. This redundancy has the following problems:
Redundant Storage: The rating value 8 corresponds to the hourly_wage 10, and this
association is repeated three times.
Update Anomalies: The hourly_wage in the first tuple could be updated without making a
similar change in the second tuple.
Insertion Anomalies: We cannot insert a tuple for an employee unless we know the
hourly_wage for the employee’s rating value.
Deletion Anomalies: If we delete all tuples with a given rating value (e.g., we delete the
tuples for Ayush Soni and Rajasekhar) we lose the association between that rating value
and its hourly_wage value.
Null Values:
Null values cannot provide a complete solution, but they can provide some help.
Consider the example Hourly_Emps relation. Here null values cannot help to eliminate
redundant storage, update or deletion anomalies. It appears that they can address insertion
anomalies. For instance, we can insert an employee tuple with null values in the hourly wage
field. However, null values cannot address all insertion anomalies. Thus, null values do not
provide a general solution to the problems of redundancy, even though they can help in some
cases.
Decompositions:
Wages(rating, hourly_wages)
To answer a first question, several normal forms have been proposed for relations. If a relation
schema is in one of these normal forms, we know that certain kinds of problems cannot arise.
The lossless-join property enables us to recover any instance of the decomposed relation from
the corresponding instances of the smaller relations.
Functional Dependencies:
Let R be a relation schema and let X and Y be nonempty sets of attributes in R. We say that an
instance r of R satisfies the FD X → Y if the following holds for every pair of tuples t1 and t2 in
r:
If t1.X = t2.X, then t1.Y = t2.Y.
X → Y is read as X functionally determines Y, or simply as X determines Y.
An FD X→Y says that if two tuples agree on the values in attributes X, they must also agree on
the values in attributes Y.
A B C D
a1 b1 c1 d1
a1 b1 c1 d2
a1 b2 c2 d1
a2 b1 c3 d1
Here, if we add the tuple <a1, b1, c2, d1> to the instance shown above, the resulting instance
would violate the FD AB → C, because the new tuple agrees with the first tuple on A and B but not on C.
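The pairwise test behind this definition can be sketched in a few lines of Python. This is a minimal illustration, taking the FD to be AB → C (the reading consistent with the table); the function name `satisfies_fd` and the dictionary layout of the tuples are assumptions made for the example.

```python
from itertools import combinations

def satisfies_fd(rows, lhs, rhs):
    """True if every pair of rows that agrees on lhs also agrees on rhs."""
    for t1, t2 in combinations(rows, 2):
        agree_lhs = all(t1[a] == t2[a] for a in lhs)
        agree_rhs = all(t1[a] == t2[a] for a in rhs)
        if agree_lhs and not agree_rhs:
            return False
    return True

# The four tuples of the instance over A, B, C, D.
instance = [dict(zip("ABCD", row)) for row in [
    ("a1", "b1", "c1", "d1"),
    ("a1", "b1", "c1", "d2"),
    ("a1", "b2", "c2", "d1"),
    ("a2", "b1", "c3", "d1"),
]]
# The same instance with the extra tuple <a1, b1, c2, d1>.
extended = instance + [dict(zip("ABCD", ("a1", "b1", "c2", "d1")))]

print(satisfies_fd(instance, "AB", "C"))   # True
print(satisfies_fd(extended, "AB", "C"))   # False
```

Note that an instance can only refute an FD; satisfying it on one instance does not prove that the FD holds on the schema.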
Given a set of FDs over a relation schema R, typically several additional FDs hold over R
whenever all of the given FDs hold.
For example, consider a relation Workers with the given FDs ssn → did and did → lot. In any legal
instance of Workers, if two tuples have the same ssn value, they must have the same did value, and
because they have the same did value, they must also have the same lot value. Therefore, the FD
ssn → lot also holds on Workers.
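The set of all FDs implied by a given set F of FDs is called the closure of F, denoted as F+. The
closure F+ can be calculated by repeatedly applying the following rules, known as Armstrong's
Axioms. Let X, Y, and Z be sets of attributes over a relation schema R:
Reflexivity: If Y ⊆ X, then X → Y.
Augmentation: If X → Y, then XZ → YZ for any Z.
Transitivity: If X → Y and Y → Z, then X → Z.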
Ex1:
Ex2:
i) C→ CSJDPQV.
ii) JP→C.
iii) SD→P.
Several additional FDs hold in the closure of the set of given FDs:
Note:
In a trivial FD, the right side contains only attributes that also appear on the left side.
Using reflexivity, we can generate all trivial dependencies, which are of the form:
X → Y, where Y ⊆ X ⊆ R.
Attribute Closure
If we want to check whether a given dependency, say, X→Y, is in the closure F+, we can do so
efficiently without computing F+. We first compute the attribute closure X+ with respect to F,
which is the set of attributes A such that X→A can be inferred using Armstrong Axioms. The
algorithm for computing the attribute closure of a set X of attributes is shown below:
closure = X;
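A minimal Python sketch of the usual fixed-point computation of X+ follows: start from X and keep adding the right side of any FD whose left side is already covered, until nothing changes. The function name `attribute_closure` and the encoding of each FD as a pair of attribute strings are assumptions made for the illustration; the FDs are the three Contracts FDs listed earlier.

```python
def attribute_closure(attrs, fds):
    """Return the set of attributes determined by attrs under fds."""
    closure = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Fire the FD when its left side is covered and it adds something new.
            if set(lhs) <= closure and not set(rhs) <= closure:
                closure |= set(rhs)
                changed = True
    return closure

contracts_fds = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]

print("".join(sorted(attribute_closure("SDJ", contracts_fds))))  # CDJPQSV
print("".join(sorted(attribute_closure("SD", contracts_fds))))   # DPS
```

To check whether X → Y is in F+, it is enough to test whether Y is a subset of the returned set.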
(1) It deals with the 1-1 relationship between attributes, and rarely it also covers 1-M.
(2) An F.D must be defined on the schema, not on instances.
(3) An F.D must be non-trivial.
(4) In a trivial F.D, the RHS is a subset of the LHS. Eg: ABC → BC
(5) In a non-trivial F.D, at least one of the RHS attributes is not in the LHS. Eg: ABC → BD
(6) In a completely non-trivial F.D, none of the RHS attributes are in the LHS. Eg: ABC → DE
Once the F.D’s are identified from semantics, additional F.D’s can be derived from
the existing set.
Eg: If F1 comes from semantics and F2 is derived from F1, then the total F.D’s = F1 + F2. The input
for the normalization process should be F1 + F2. F2 can be identified in two ways:
(1) By using inference rules.
(2) By closure set of attributes.
INFERENCE RULES:
(1) Reflexive: If ‘B’ is a subset of ‘A’ then ‘A’ always determines ‘B’: A → B
(2) Augmentation: If A → B then AC → BC
(3) Transitive: If A → B and B → C then A → C
(4) Union: It applies to F.D’s with the same LHS, i.e., if A → B and A → C then A → BC
(5) Decomposition: If A → BC then we can write it as A → B and A → C
(6) Composition: If A → B and C → D then AC → BD
(7) Self determination: A → A, B → B
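Q1 Find the additional F.D’s (F2) derived from F1, where F1 is a set of F.D’s from semantics.
F1:
(1) A → B
(2) B → C
(3) C → D
(4) D → E
(5) D → H
(6) E → F
(7) F → G
(8) G → H
F2:
A → C (transitive rule on 1 and 2)
D → EH (union rule on 4 and 5)
F → H (transitive rule on 7 and 8)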
(1) Let ‘X’ be the set of attributes that will become the closure.
(2) Repeatedly search for an F.D whose LHS is a part of ‘X’, then add the RHS of that
F.D to ‘X’ if it is not already present.
(3) Repeat step (2) as many times as necessary until no more attributes can be added to ‘X’.
(4) The set ‘X’, once no more attributes can be added to it, is the closure set.
Applications of closure set of attributes:
Eg 1: F.D’s are
A → B
BC → DE
AEG → G
Find AC+
Ans: X = AC
= ACB
= ABCDE
AC+ = ABCDE
Eg 2: F.D’s are
A → BC
CD → E
B → D
E → A
Find B+, AB+ & CD+
Ans: X = B
= BD
B+ = BD
Find AB+
X = AB
= ABC
= ABCD
= ABCDE
AB+ = ABCDE
Find CD+
X = CD
= CDE
= ACDE
= ABCDE
CD+ = ABCDE
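Eg 3: Check whether the following F.D’s can be derived from F1.
F1:
AB → C
BC → AD
D → E
CF → B
i. D → A
X = D
= DE
D+ = DE
A is not in D+, i.e., D cannot determine A.
ii. AB → D
X = AB
= ABC
= ABCD
= ABCDE
AB+ = ABCDE
D is in AB+, so AB → D holds.
iii. AB → F
X = AB
= ABC
= ABCD
= ABCDE
AB+ = ABCDE
F is not in AB+, so AB cannot determine F.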
Sol:
i. X=BCD
=BCDE
=BCDEH
=BCDEAH
BCD+=BCDEAH
H is in BCD+, so BCD → H holds.
(ii) ABC → H
X=ABC
ABC H
(2) Identification of keys by using the closure set of attributes
A key attribute: An attribute that is capable of identifying all other attributes in a given table.
i) Primary key:
It is a unique-valued attribute in a table, used to enforce entity integrity and to
identify rows in the table uniquely.
ii) Composite Primary Key:
Sometimes a single attribute is not sufficient to identify the rows in the table uniquely, so
we combine 2 or more attributes to identify the rows uniquely.
iii) Candidate keys:
Sometimes 2 or more independent attributes (or sets of attributes) can each be used to identify
the rows uniquely. Eg: (vehicle no, vehicle engine no, purchase date)
Either vehicle no or vehicle engine no can be used as a key attribute, so they are called
candidate keys. One of the candidate keys can be elected as the primary key.
iv) Surrogate key:
Sometimes, even if you combine all the attributes in the table, they may not have unique
values.
To identify the rows uniquely, we use a system-generated key called a surrogate
key.
v) Foreign key:
It is used to enforce referential integrity. An attribute in a table is called a
foreign key attribute if it refers to the primary key of the same or a different table.
Q1 Consider a relation with six attributes ABCDEH and FDs are
A → BC
CD → E
E → C
D → AEH
AEH → BD
DH → BC
Find keys?
Super key = ABCDEH
Now find A+ = ABC
E+ = EC
D+ = DAEH
= ABCDEH
D is a key
If the closure of any LHS attribute, or of any combination of LHS attributes,
includes all the attributes in the table, then it becomes a key of the table, provided no
proper subset of it already does so.
A table can have 2 or more keys.
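The search for keys can be automated by trying attribute sets in order of increasing size and keeping those whose closure is the whole table. The sketch below is a brute-force illustration only (it is exponential in the number of attributes), and the names `closure` and `candidate_keys` are hypothetical. It reads the FDs of the example above as A → BC, CD → E, E → C, D → AEH, AEH → BD, DH → BC.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def candidate_keys(schema, fds):
    """All minimal attribute sets whose closure covers the schema."""
    schema = set(schema)
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            candidate = set(combo)
            if any(key <= candidate for key in keys):
                continue  # contains a smaller key, so it is not minimal
            if closure(candidate, fds) == schema:
                keys.append(candidate)
    return keys

fds = [("A", "BC"), ("CD", "E"), ("E", "C"),
       ("D", "AEH"), ("AEH", "BD"), ("DH", "BC")]
for key in candidate_keys("ABCDEH", fds):
    print("".join(sorted(key)))
```

Under this reading of the FDs the search reports D and also AEH, since AEH → BD brings in D and therefore everything else.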
Q2 Consider a relation with five attributes ABCDE and FDs are
A → B
BC → E
ED → A
A+ = AB X
BC+ = BCE X
ED+ = ABDE X
AB+ = AB X
AC+ = ABCE X
BD+ = BD X
ABC+ = ABCE X
BCD+ = BCDEA
ACD+ = ABCDE
CDE+ = ADEBC
BCD, ACD & CDE are keys
Q3 R(ABCDE) & FDs are
AB → C
CD → E
DE → B
AB+ = ABC X
CD+ = CDEB X
DE+ = DEB X
ABC+ = ABC X
ABD+ = ABDCE
ABE+ = ABEC X
ACD+ = ACDEB
ADE+ = ADEBC
ABD, ACD & ADE are keys
AB → C
A → DE
B → F
F → GH
D → IJ
Note:
Sometimes all the attributes in the table may not appear in F.D’s
AB+=ABCDEFGHIJ
The key for the relation R=ABJ the missing attributes from the F.D’s must be attached to
the closure.
Sol:
Super key=ABDH
AB+=ABC X
BD+=BDEF X
AD+=ADGHIJ X
ABD+=ABDCEFGHIJ
ABH+=ABHCIJ X
ABD+=ABCDEFGHIJ
(3) To identify equivalence of F.D’s
Different database designers may define different F.D sets from the same
requirements. Two sets F and G are equivalent if we are able to derive all F.D’s in
G from F and vice versa.
Eg:
F =
A → C
AC → D
E → AD
E → H
G =
A → CD
E → AH
Step 1: Take set F and check that all F.D’s in G can be derived from F.
A → CD
A+ from F
X = A
= AC
= ACD
A → CD can be derived from F
E → AH
E+ from F
X = E
= EAD
= EADH
E → AH can be derived from F
Step 2: Take set G and check that all F.D’s in F can be derived from G.
A → C
A+ from G
X = A
= ACD
A → C can be derived from G
AC → D
AC+ from G
X = AC
= ACD
AC → D can be derived from G
E → AD and E → H
E+ from G
X = E
= EAH
= EAHCD
E → AD and E → H can be derived from G
F ≡ G, so G is preferable as it contains fewer F.D’s.
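The two-step check above can be written directly in terms of attribute closures. In this minimal sketch the names `closure`, `covers` and `equivalent` are hypothetical, and the FD sets are read as F = {A → C, AC → D, E → AD, E → H} and G = {A → CD, E → AH}.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def covers(f, g):
    """True if every FD in g can be derived from f."""
    return all(set(rhs) <= closure(lhs, f) for lhs, rhs in g)

def equivalent(f, g):
    """True if f and g derive each other, i.e. they have the same closure."""
    return covers(f, g) and covers(g, f)

F = [("A", "C"), ("AC", "D"), ("E", "AD"), ("E", "H")]
G = [("A", "CD"), ("E", "AH")]

print(equivalent(F, G))  # True
```

Only one closure per FD is needed, so the check never has to enumerate F+ or G+ in full.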
G =
B → CDE
B → ABC
AD → E
Sol:
Step 1:
B → CDE
B+ from F
X = B
= BCDA
= ABCDE
All FD’s are derivable from F.
Step 2:
B → CD
B+ from G
X = B
= BCDE
= ABCDE
All FD’s are derivable from G.
F ≡ G
F is preferable, as it has fewer dependencies.
(4) To identify the irreducible form of FD’s / canonical form
Once F1 is identified from the semantics and F2 is derived from F1, we get the total F.D’s, i.e., F.
Before moving on to the normalization process with F, F must be evaluated for
redundant attributes on the LHS and RHS of its F.D’s. It is a four-step process.
Step 1: Apply the decomposition rule so that every F.D has a single attribute on its RHS.
Step 2: Evaluate all F.D’s in step 1 for their necessity. If they are not necessary, remove them
from the list.
Step 3: Evaluate the necessity of the LHS attributes in the FD’s obtained from step 2. If they are not
necessary, remove them from the FD.
Step 4: Apply the union rule to FD’s with a common LHS obtained from step 3. Then
we will get the irreducible set.
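A minimal Python sketch of this reduction is given below. One deliberate difference from the order listed above: it trims LHS attributes before dropping unnecessary F.D's, which is the order most textbooks use, so that an F.D which becomes redundant only after trimming is also caught. The function names and the string encoding of F.D's are assumptions made for the illustration.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds, given as (lhs, rhs) pairs."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def minimal_cover(fds):
    # Split every right-hand side into single attributes.
    work = list(dict.fromkeys((frozenset(lhs), a) for lhs, rhs in fds for a in rhs))
    # Trim left-hand-side attributes that are not needed.
    for i, (lhs, a) in enumerate(work):
        for b in sorted(lhs):
            reduced = lhs - {b}
            if reduced and a in closure(reduced, work):
                lhs = reduced
                work[i] = (lhs, a)
    # Drop every FD that is implied by the remaining ones.
    result = list(dict.fromkeys(work))
    for fd in list(result):
        rest = [g for g in result if g != fd]
        if fd[1] in closure(fd[0], rest):
            result = rest
    # Union rule: merge FDs that share a left-hand side.
    merged = {}
    for lhs, a in result:
        merged.setdefault("".join(sorted(lhs)), set()).add(a)
    return {lhs: "".join(sorted(rhs)) for lhs, rhs in merged.items()}

F = [("A", "B"), ("C", "B"), ("D", "ABC"), ("AC", "D")]
print(minimal_cover(F))  # {'A': 'B', 'C': 'B', 'D': 'AC', 'AC': 'D'}
```

A minimal cover is not unique in general; the result can depend on the order in which F.D's and attributes are examined.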
Q1 FDs are
F =
A → B
C → B
D → ABC
AC → D
Sol:
Step 1:
(1) A → B
(2) C → B
(3) D → A
(4) D → B
(5) D → C
(6) AC → D
Step 2:
Remove 1 & compute A+ from 2, 3, 4, 5 & 6
A+ = A
We need 1.
Remove 2 and compute C+ from 1, 3, 4, 5 & 6
C+ = C
We need 2.
Remove 3 and compute D+ from 1, 2, 4, 5 & 6
D+ = DBC
We need 3.
Remove 4 and compute D+ from 1, 2, 3, 5 & 6
D+ = ADCB
D → B can be removed.
Remove 5 and compute D+ from 1, 2, 3 & 6
D+ = ABD
We need 5.
Remove 6 and compute AC+ from 1, 2, 3 & 5
AC+ = ACB
We need 6.
Step 3:
A → B
C → B
D → A
D → C
AC → D
Remove A from AC → D
With C → D in place of AC → D: C+ = CDAB
With the original set: C+ = CB
The two closures differ, so A cannot be removed.
Remove C from AC → D
With A → D in place of AC → D: A+ = ADCB
With the original set: A+ = AB
The two closures differ, so C cannot be removed.
Step 4:
A → B
C → B
D → AC
AC → D
Therefore, it is an irreducible set of F.D’s.
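Q2 FDs are
(1) AB → C
(2) C → B
(3) A → B
Find the irreducible set
Sol:
Step 2:
Remove (1) & compute AB+ from 2 & 3
AB+ = AB
We need 1
Remove (2) & compute C+ from 1 & 3
C+ = C
We need 2
Remove (3) & compute A+ from 1 & 2
A+ = A
We need 3
Step 3:
AB → C
C → B
A → B
Remove A from AB → C
With B → C in place of AB → C: B+ = BC
With the original set: B+ = B
The two closures differ, so A cannot be removed.
Remove B from AB → C
With A → C in place of AB → C: A+ = ACB
With the original set: A+ = ABC
The two closures are equal, so B can be removed.
Step 4:
A → C
C → B
A → B
Once B has been removed, A → B follows from A → C and C → B, so it is no longer necessary.
The irreducible set is A → C, C → B.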
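Q3 FDs are
F = ABD → AC
C → BE
AD → BF
B → E
Find the minimal set
Step 1:
(1) ABD → A
(2) ABD → C
(3) C → B
(4) C → E
(5) AD → B
(6) AD → F
(7) B → E
Step 2:
Remove (1) & compute ABD+ from (2-7)
ABD+ = ABDCEF
(1) can be removed
Remove (2) & compute ABD+ from (3-7)
ABD+ = ABDEF
We need (2)
Remove (3) & compute C+ from (2, 4-7)
C+ = CE
We need (3)
Remove (4) & compute C+ from (2, 3, 5-7)
C+ = BCE
(4) can be removed
Remove (5) & compute AD+ from (2, 3, 6, 7)
AD+ = ADF
We need (5)
Remove (6) & compute AD+ from (2, 3, 5, 7)
AD+ = ADBCE
We need (6)
Remove (7) & compute B+ from (2, 3, 5, 6)
B+ = B
We need (7)
Step 3:
ABD → C
C → B
AD → B
AD → F
B → E
Remove A from ABD → C
With BD → C in place of ABD → C: BD+ = BDCE
With the original set: BD+ = BDE
The two closures differ, so A cannot be removed.
Remove B from ABD → C
With AD → C in place of ABD → C: AD+ = ADCFBE
With the original set: AD+ = ABFECD
The two closures are equal, so B can be removed.
Step 4:
Once B has been removed, AD → B follows from AD → C and C → B, so it is no longer necessary.
The minimal set is
AD → CF
C → B
B → E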
1. Partial F.D: A dependency in which a non-key attribute depends on only a part of the
key.
Eg: R = ABCD
F: AB → C
B → D
Key: AB, but D depends only on B; therefore B → D is considered a partial dependency.
2. Transitive F.D: A dependency in which a non-key attribute depends on another non-key attribute.
Eg: R = ABCD
F: AB → C
AB → D
C → D
Key = AB
C → D is a transitive dependency, and it causes insertion, deletion & updation problems in
the table.
3. Full F.D: A dependency X → Y is considered a full F.D if the removal of any attribute
from X makes X → Y an invalid F.D.
Eg: AB → CD holds
B → CD does not hold
AB → CD is a full F.D
NORMALIZATION
It is the process of reducing redundancy based on primary keys and F.D’s
OR
It is a tool to validate or evaluate the logical database design with the help of rules which
are called Normal Forms. They are
1 NF
2 NF
3 NF
BCNF
4 NF
5 NF
DKNF
Moving down this list, the problem intensity reduces and the no. of tables needed increases.
Points to Remember
Sol: (a) Key = AB
AB → CDEFGHIJ
(b) A+ = ADEIJ
B+ = BFGH
If there is a partial dependency, remove the partially dependent attributes from the original table and
place them in a separate table along with a copy of their determinant.
(c) R1 = ADEIJ
R2 = BFGH
R3 = ABC
Required 2 NF
Q2 Consider the relation R=ABCDEF and set of FDs are
F = A → FC
C → D
B → E
Find the key and normalize into 2NF
Sol:
(a) Key = AB
(b) A+ = ACDF
B+ = BE
(c) R1 = ACDF
R2 = BE
R3 = AB
Required 2 NF
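The 2NF step used in these solutions (take the closure of each proper part of the key, and move what it determines into its own table) can be sketched as follows. This is a simplified illustration: it assumes the key is supplied, considers only that one key, and the names `closure` and `to_2nf` are hypothetical.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def to_2nf(schema, key, fds):
    """Split off the attributes that depend on a proper part of the key."""
    key = set(key)
    placed = set()   # non-key attributes already moved to a new table
    tables = []
    for size in range(1, len(key)):              # proper subsets of the key only
        for part in combinations(sorted(key), size):
            dependents = closure(part, fds) - key - placed
            if dependents:
                tables.append("".join(sorted(set(part) | dependents)))
                placed |= dependents
    # What is left stays with the full key.
    tables.append("".join(sorted(set(schema) - placed)))
    return tables

fds = [("A", "FC"), ("C", "D"), ("B", "E")]
print(to_2nf("ABCDEF", "AB", fds))  # ['ACDF', 'BE', 'AB']
```

The output matches R1, R2 and R3 of Q2. Transitive dependencies inside the new tables (such as C → D in ACDF) are left for the 3NF step.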
Q3 Consider the relation R=ABCDE. Find the key and normalize upto 2NF
F = B → E
C → D
A → B
Sol: (a) Key = AC
(b) A+ = ABE
C+ = CD
(c) R1 = ABE
R2 = CD
R3 = AC
Required 2 NF
F = AB → C
BD → EF
AD → GH
A → I
H → J
Find the key and normalize upto 2NF
R4=ABD
Third Normal Form (3 NF): A table is said to be in 3 NF if it is already in 2 NF and
is free from transitive dependencies.
Anomalies can occur when a relation contains one or more transitive dependencies.
A transitive dependency exists when A → B → C and NOT B → A.
A relation is in 3NF when it is in 2NF and has no transitive dependency.
A relation is in 3NF when “All non-key attributes are dependent on the key, the whole
key and nothing but the key”.
If there is a transitive dependency, remove the transitively dependent attribute from the 2 NF table and
place it in a separate table along with a copy of its determinant.
Update anomalies occur in an 3NF relation R if
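Eg 1: The 2 NF tables are ADEIJ, BFGH and ABC.
ADEIJ: transitive dependency D → IJ, so it is split into DIJ and ADE
BFGH: transitive dependency F → GH, so it is split into FGH and BF
ABC: no transitive dependency
R1 = DIJ
R2 = ADE
R3 = FGH
R4 = BF
R5 = ABC
It is in 3NF
Eg 2: The 2 NF tables are ACDF, BE and AB.
ACDF: transitive dependency C → D, so it is split into CD and ACF
R1 = CD
R2 = ACF
R3 = BE
R4 = AB
It is in 3NF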
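Q3 Consider the relation R=ABCDE. Find the key and normalize upto 3NF
F = B → E
C → D
A → B
Sol: The 2 NF tables are
R1 = ABE: transitive dependency B → E, so it is split into BE and AB
R2 = CD
R3 = AC
The 3 NF tables are
R1 = BE
R2 = AB
R3 = CD
R4 = AC
It is in 3NF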
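F = AB → C
BD → EF
AD → GH
A → I
H → J
Normalize upto 3NF
The 2 NF tables are
R1 = AI
R2 = ABC
R3 = BDEF
R4 = ADGHJ: transitive dependency H → J, so it is split into HJ and ADGH
R5 = ABD
The 3 NF tables are
R1 = AI
R2 = ABC
R3 = BDEF
R4 = HJ
R5 = ADGH
R6 = ABD
It is in 3NF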
Q5(a) Give a set of FDs for the relation schema R(ABCD) with primary key AB under
which R is 1NF but not in 2NF
(b) Find FDs such that R is in 2NF but not in 3NF
R=ABCD
Key=AB
Sol: (a) With either of these sets of FD’s, the table cannot be in 2NF:
B → C, B → D
or
A → C, A → D
(b) With either of these FD’s, the table may be in 2NF but not in 3NF:
C → D
or
D → C
Note 1: In general if x ; if AorB and x A or x b (key=AB) then it will violate 2 NF
Note 2: In general, if I have x and x is not a proper set of AB then it
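Q:
R:
A B C
a1 b1 c1
a2 b2 c2
a3 b1 c3
R1 = ∏AB (R)
A B
a1 b1
a2 b2
a3 b1
R2 = ∏BC (R)
B C
b1 c1
b2 c2
b1 c3
R1 ⋈ R2
A B C
a1 b1 c1
a1 b1 c3
a2 b2 c2
a3 b1 c1
a3 b1 c3
The join has 5 rows but R has only 3; the tuples (a1, b1, c3) and (a3, b1, c1) are spurious.
Lossy decomposition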
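The same experiment can be reproduced in Python: project R onto AB and BC, join the projections on the common column B, and compare with R. This is a minimal sketch; the helper names `project` and `natural_join` are assumptions made for the illustration.

```python
def project(rows, attrs):
    """Projection with duplicate elimination."""
    seen = {tuple((a, row[a]) for a in attrs) for row in rows}
    return [dict(pairs) for pairs in seen]

def natural_join(left, right):
    """Combine every pair of rows that agrees on the shared attributes."""
    joined = []
    for l in left:
        for r in right:
            common = l.keys() & r.keys()
            if all(l[a] == r[a] for a in common):
                joined.append({**l, **r})
    return joined

R = [
    {"A": "a1", "B": "b1", "C": "c1"},
    {"A": "a2", "B": "b2", "C": "c2"},
    {"A": "a3", "B": "b1", "C": "c3"},
]
joined = natural_join(project(R, "AB"), project(R, "BC"))
spurious = [row for row in joined if row not in R]

print(len(R), len(joined))   # 3 5
print(len(spurious))         # 2
```

Every original tuple is still present in the join; what is lost is the ability to tell the original tuples from the spurious ones.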
The above method is time consuming and error prone; therefore, to check the lossless-join
property we use the following shortcut method.
i.e., if the common column between the relations has unique values in at least one of them (only a
key will have unique values), then the decomposition is lossless; otherwise it is a lossy
decomposition.
R1 ∩ R2 → R1
or
R1 ∩ R2 → R2
Eg: Consider a table R = ABCD with F = { A → B, A → C, C → D } that is decomposed into
R1(ABC), R2(CD). Find whether this decomposition satisfies the lossless-join and dependency-
preserving properties.
Sol: R1 (ABC)
R2 (CD)
R1 ∩ R2 = C and C → D, so C is a key of R2 and the decomposition is lossless.
A → B (R1)
A → C (R1)
C → D (R2)
Every F.D lies within R1 or R2, so the decomposition is dependency preserving.
Boyce-Codd Normal Form (BCNF): A table is said to be in BCNF if it is already in 3NF and every
non-trivial F.D has a candidate key as its determinant.
OR
A table is said to be in BCNF if all determinants in the 3NF table are keys, or more generally super
keys.
Anomalies can occur in a relation in 3NF if there are determinants in the relation that are
not candidate keys.
A relation is in BCNF if every determinant is a candidate key.
To test whether a relation is in BCNF, we identify all the determinants and make sure that
they are candidate keys.
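This determinant test can be sketched with attribute closures: a non-trivial F.D violates BCNF when the closure of its LHS does not cover the whole table. The names below are hypothetical. The sketch checks only the F.D's given for the original table, which is enough for that table but not for the tables of a decomposition, where the projected F.D's would be needed.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def bcnf_violations(schema, fds):
    """Non-trivial FDs whose determinant is not a super key."""
    schema = set(schema)
    return [
        (lhs, rhs)
        for lhs, rhs in fds
        if not set(rhs) <= set(lhs) and not closure(lhs, fds) >= schema
    ]

# R = ABCD with AB -> C, AB -> D, C -> A, D -> B.
fds = [("AB", "C"), ("AB", "D"), ("C", "A"), ("D", "B")]
print(bcnf_violations("ABCD", fds))  # [('C', 'A'), ('D', 'B')]
```

An empty list means that every determinant is a super key, i.e. the table is in BCNF with respect to the given F.D's.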
The following conditions are not properly handled by 3NF
3. 3NF: It may preserve all dependencies. BCNF: It may not preserve all F.D’s.
Q:
(1) R = ABCD
A → D
C → A
B → C
Sol:
(a) Key = B
(b) At present ABCD is in 2NF
(c) BACD
(2) B → C
D → A
Sol:
(a) Key=BD
(b) 1 NF
(c) 2 NF=BC,DA,BD
3 NF= BC, DA, BD
(d) BCNF= BC,DA,BD
BCD DA
Lossless decomposition.
(3) ABC → D
D → A
Sol:
(a) Key = ABC, BCD
(b) Let key = ABC
3 NF
(c) BCNF:
ABC → D (ABC is a super key)
D → A (D is not a super key, so the relation is not in BCNF)
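Q. R = ABCD
A → B
BC → D
A → C
Sol:
a). key = A
b). 2NF
c). 3NF = BCD, ABC
d). BCNF: in ABC the determinant A is the key, and in BCD the determinant BC is the key,
so both tables are in BCNF.
Q). AB → C
AB → D
C → A
D → B
Sol:
a). key = AB, BC, CD, AD
b). 3NF
c). AB → C and AB → D: AB is a key
C → A: C is not a key X
D → B: D is not a key X
So the relation is in 3NF but not in BCNF.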
Q). AB → CEFG
A → D
F → G
FB → H
HBC → ADEFG
FBC → ADE
Sol:
a). key = AB, HBC, FBC
ABCEFGH
2NF=AD,ABCEFGH
c). 3NF=AD,ABCEFGH FG
ABCEFH
FBH
D).BCNF
AB A
A F
F AB
FB
HBC
FBC
BCNF=FG,AD,FBH,ABCEF
Q). R = ABCD
(1). R1 = BC, R2 = AD
B → C
D → A
Sol:
a). key = BD
b). R1 ∩ R2 = ∅
lossy decomposition
bad decomposition
(2). R1 = ACD, R2 = BC
AB → C
C → A
C → D
Sol:
a). key = AB, BC
b). R1 ∩ R2 = C
It is a lossless decomposition, as the common attribute ‘C’ is a key for the first table
ACD.
(3). R1 = ABC, R2 = AD
A → BC
C → AD
Sol:
a). key = A
b). R1 ∩ R2 = A
Lossless
(4). R1 = AB, R2 = ACD
A → B
B → C
C → D
Sol:
a). key = A
b). R1 ∩ R2 = A
Lossless
FD not preserved: B → C
Q). R1 = AB, R2 = AD, R3 = CD
A → B
B → C
C → D
Sol:
It is a lossy decomposition
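Q).
R = ABCDE
AB → DE
A → C
D → E
Sol:
(a) Key = AB
(b) 2 NF = ABDE, AC
(c) 3 NF = ABD, DE, AC
(d) BCNF: in ABD the determinant AB is the key, in DE the determinant D is the key, and in AC
the determinant A is the key, so all three tables are in BCNF.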
Q)
AB → CDE
C → A
D → E
Sol:
(a) Key = AB, BC
(b) Choose AB as the key; there are no partial dependencies
3 NF = ABCD, DE
BCNF = BCD, CA, DE (in ABCD the determinant C of C → A is not a key, so CA is split off)
Limitations of Normalization:
The possible existence of multi-valued dependencies in a relation is due to first normal form
(1NF), which disallows an attribute in a tuple from having a set of values.
For example, if we have two multi-valued attributes in a relation, we have to repeat each value of
one of the attributes with every value of the other attribute, to ensure that tuples of a relation are
consistent. This type of constraint is referred to as a multi-valued dependency and results in data
redundancy.
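Consider the Employee relation, which is not in 1NF:
EmpNum EmpPhone EmpDegrees
111 {040-222222, 040-333333} {BA, BSc}
After conversion to 1NF:
EmpNum EmpPhone EmpDegrees
111 040-222222 BA
111 040-222222 BSc
111 040-333333 BA
111 040-333333 BSc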
This relation records the EmpPhone and EmpDegrees details of an employee 111. However, the
EmpDegrees of an employee are independent of EmpPhone. This constraint results in data
redundancy and is referred to as multi-valued dependency.
A multi-valued dependency (MVD) represents a dependency between attributes (for example, A, B,
and C) in a relation, such that for each value of A there is a set of values for B and a set of
values for C. However, the sets of values for B and C are independent of each other.
We represent an MVD between attributes A, B, and C in a relation using the following notation:
A →→ B
A →→ C
For example, we specify the MVD in the above Employee relation as follows:
EmpNum →→ EmpPhone
EmpNum →→ EmpDegrees
Let Z = R − XY. If a relation R satisfies X →→ Y, the following must be true for every legal
instance r of R:
if t1 and t2 are any two tuples in r with t1(X) = t2(X), then there exists t3 in r such that
t3(X) = t1(X), t3(Y) = t1(Y), t3(Z) = t2(Z).
By symmetry, there exists t4 in r such that t4(X) = t1(X), t4(Y) = t2(Y), t4(Z) = t1(Z).
X Y Z
x1 y1 z1 t1
x1 y2 z2 t2
x1 y1 z2 t3
x1 y2 z1 t4
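The tuple-swapping condition can be checked mechanically on an instance. The sketch below is an illustration only: the function name `satisfies_mvd` and the dictionary layout of the tuples are assumptions, and Z is taken to be every attribute outside X and Y.

```python
def satisfies_mvd(rows, x, y, schema):
    """True if the instance satisfies the MVD x ->-> y."""
    z = [a for a in schema if a not in x and a not in y]
    present = {tuple(row[a] for a in schema) for row in rows}
    for t1 in rows:
        for t2 in rows:
            if all(t1[a] == t2[a] for a in x):
                # t3 takes its X and Y values from t1 and its Z values from t2.
                t3 = {**{a: t1[a] for a in list(x) + list(y)},
                      **{a: t2[a] for a in z}}
                if tuple(t3[a] for a in schema) not in present:
                    return False
    return True

rows = [dict(zip("XYZ", t)) for t in [
    ("x1", "y1", "z1"),   # t1
    ("x1", "y2", "z2"),   # t2
    ("x1", "y1", "z2"),   # t3
    ("x1", "y2", "z1"),   # t4
]]

print(satisfies_mvd(rows, "X", "Y", "XYZ"))                 # True
print(satisfies_mvd(rows[:2] + rows[3:], "X", "Y", "XYZ"))  # False, t3 is missing
```

Because the loop visits every ordered pair (t1, t2), the symmetric tuple t4 is covered by the same test.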
The MVD X→→ Y says that the relationship between X and Y is independent of the
relationship between X and R─ Y.
Armstrong Axioms rules relate to MVDs:
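Trivial MVD:
An MVD A →→ B in relation R is defined as being trivial if,
a) B is a subset of A or
b) A U B = R.
Note: X →→ Y can be read as X multi-determines Y.
Fourth Normal Form (4NF):
A relation R is in 4NF if, for every MVD X →→ Y that holds over R, one of the following is true:
Y is a subset of X or XY = R, or
X is a super key.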
Example:
We decompose the Employee relation into Emp1 and Emp2 relations as shown below:
Emp1
EmpNum EmpPhone
111 040-222222
111 040-333333
Emp2
EmpNum EmpDegrees
111 BA
111 BSc
Both new relations are in 4NF because the Emp1 relation contains the trivial MVD
EmpNum→→EmpPhone, and the Emp2 relation contains the trivial MVD
EmpNum→→EmpDegrees.
Properties of Decomposition
Lossless-Join Decomposition:
Let R be a relation schema and let F be a set of FDs over R. A decomposition of R into two
schemas with attribute sets X and Y is said to be a lossless-join decomposition with respect to
F if, for every instance r of R that satisfies the dependencies in F, ∏X (r) ⋈ ∏Y (r) = r. In
other words, we can recover the original relation from the decomposed relations.
From the definition it is easy to see that r is always a subset of natural join of decomposed
relations. If we take projections of a relation and recombine them using natural join, we typically
obtain some tuples that were not in the original relation.
Example:
By replacing the instance r shown below with the instances ∏SP (r) and ∏PD (r), we lose some
information.
Instance r:
S P D
s1 p1 d1
s2 p2 d2
s3 p1 d3
∏SP (r):
S P
s1 p1
s2 p2
s3 p1
∏PD (r):
P D
p1 d1
p2 d2
p1 d3
∏SP (r) ⋈ ∏PD (r):
S P D
s1 p1 d1
s2 p2 d2
s3 p1 d3
s1 p1 d3
s3 p1 d1
Theorem: Let R be a relation and F be a set of FDs that hold over R. The decomposition of R
into relations with attribute sets R1 and R2 is lossless if and only if F+ contains either the FD R1
∩ R2 → R1 (or R1 − R2) or the FD R1 ∩ R2 → R2 (or R2 − R1).
Consider the Hourly_Emps relation. It has attributes SNLRWH, and the FD R → W causes a
violation of 3NF. We dealt with this violation by decomposing the relation into SNLRH and RW.
Since R is common to both decomposed relations and R → W holds, this decomposition is
lossless-join.
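For a decomposition into two relations, the theorem reduces to a single attribute-closure computation on the common attributes. A minimal sketch follows; the function names are hypothetical, and only the FD R → W is supplied because it is the one the argument uses.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def is_lossless(r1, r2, fds):
    """Binary test: the common attributes must determine all of r1 or all of r2."""
    common = set(r1) & set(r2)
    determined = closure(common, fds)
    return set(r1) <= determined or set(r2) <= determined

# Hourly_Emps (SNLRWH) split into SNLRH and RW, using the FD R -> W.
print(is_lossless("SNLRH", "RW", [("R", "W")]))  # True
# SPD split into SP and PD with no FDs: the common attribute P determines nothing else.
print(is_lossless("SP", "PD", []))               # False
```

The test applies to two relations at a time; a decomposition into three or more relations needs it to be applied step by step or a more general method.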
Dependency-Preserving Decomposition:
Consider the Contracts relation with attributes CSJDPQV. The given FDs are C→CSJDPQV,
JP→C, and SD→P. Because SD is not a key, the dependency SD→P causes a violation of
BCNF.
We can decompose Contracts into relations with schemas CSJDQV and SDP to address this
violation. The decomposition is lossless-join. But, there is one problem. If we want to enforce an
integrity constraint JP→C, it requires an expensive join of the two relations. We say that this
decomposition is not dependency-preserving.
Let R be a relation schema that is decomposed into two schemas with attribute sets X and Y,
and let F be a set of FDs over R. The projection of F on X is the set of FDs in the closure F+ that
involve only attributes in X. We denote the projection of F on attributes X as F_X. Note that a
dependency U → V in F+ is in F_X only if all the attributes in U and V are in X.
The decomposition of relation schema R with FDs F into schemas with attribute sets X and Y is
dependency-preserving if (F_X ∪ F_Y)+ = F+.
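Computing the projections of F+ explicitly is expensive, so a common alternative tests each FD of F directly: grow the left side by repeatedly taking closures restricted to one component at a time, and see whether the right side is reached. The sketch below uses that alternative rather than the definition itself; all function names are hypothetical.

```python
def closure(attrs, fds):
    """Attribute closure of attrs under fds (pairs of attribute strings)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result

def preserved(fd, parts, fds):
    """True if fd can be enforced using only FDs local to the components."""
    lhs, rhs = fd
    result = set(lhs)
    changed = True
    while changed:
        changed = False
        for part in parts:
            part = set(part)
            # What this component alone lets us add, given what we have so far.
            gained = closure(result & part, fds) & part
            if not gained <= result:
                result |= gained
                changed = True
    return set(rhs) <= result

def dependency_preserving(parts, fds):
    return all(preserved(fd, parts, fds) for fd in fds)

contracts = [("C", "CSJDPQV"), ("JP", "C"), ("SD", "P")]
print(dependency_preserving(["CSJDQV", "SDP"], contracts))        # False
print(preserved(("JP", "C"), ["CSJDQV", "SDP"], contracts))       # False

abc = [("A", "B"), ("B", "C"), ("C", "A")]
print(dependency_preserving(["AB", "BC"], abc))                   # True
```

For Contracts the failing FD is JP → C, because J and P end up in different relations.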
Example:
Consider a relation R with attributes ABC that is decomposed into relations with attributes AB and
BC. The set F of FDs over R includes A → B, B → C, and C → A.
The closure of F contains all dependencies in F plus A → C, B → A, and C → B. Consequently, F_AB
contains A → B and B → A, and F_BC contains B → C and C → B. Therefore, F_AB ∪ F_BC contains
A → B, B → C, B → A and C → B. The closure of F_AB ∪ F_BC now includes C → A (which follows
from C → B and B → A). Thus