DBMS Unit-3
ER Model
In 1976, Chen developed the Entity Relationship (ER) model, a high-level data model that is useful in developing a conceptual design for a database.
Constructing an ER diagram is the first step in designing a database.
The ER model defines the data elements and the relationships among them.
It is based on a perception of real-world data as consisting of entities (data items) and relationships among those entities.
It is a popular high-level conceptual model used for the conceptual design of databases.
ER Diagrams
Use the ER model for solving design problems.
The diagrammatic notation associated with the ER model.
Consist of entities, attributes and relationships.
Notations Used In ER Diagrams
Rectangles: entity sets
Ellipses: attributes
Diamonds: relationship sets
Lines: link attributes to entities, and entities to each other through relationships
Double lines: indicate total participation of an entity in a relationship
Double rectangles: weak entity sets
Entities
A fundamental component of the ER model.
An entity is a thing in the real world with its own independent existence. E.g. Student, Faculty.
It may be an object with physical or logical existence.
It has its own properties that describe it, known as attributes.
Entity Set
Collection of all entities of the same type.
Strong Entity
Entity type that has its own distinct primary key, by which a specific entity can be identified uniquely. E.g. Empno in the Emp table, RollNo in the Student table. Represented by a single rectangle.
Weak Entity
Entity type that cannot form a distinct primary key from its own attributes. Such entities depend on a strong entity for their primary key. Some weak entities contain a virtual (partial) primary key called a discriminator. Represented by a double rectangle.
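The strong/weak distinction can be sketched with Python's built-in sqlite3 module. This is only an illustration: the Dependent entity, the column names and the sample rows are assumptions, not part of the slides. The weak entity's key is formed from the owner's key (empno) plus its discriminator (dep_name).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
-- Strong entity: identified by its own primary key
CREATE TABLE emp (
    empno INTEGER PRIMARY KEY,
    ename TEXT NOT NULL
);
-- Weak entity (hypothetical): dep_name is only unique per employee,
-- so the key borrows empno from the owning strong entity
CREATE TABLE dependent (
    empno    INTEGER NOT NULL REFERENCES emp(empno) ON DELETE CASCADE,
    dep_name TEXT NOT NULL,
    PRIMARY KEY (empno, dep_name)
);
INSERT INTO emp VALUES (1, 'Asha'), (2, 'Ravi');
INSERT INTO dependent VALUES (1, 'Kiran'), (2, 'Kiran');
""")

# Removing the owner removes its dependents: a weak entity cannot exist alone.
conn.execute("DELETE FROM emp WHERE empno = 1")
remaining = conn.execute("SELECT empno, dep_name FROM dependent").fetchall()
print(remaining)
```

Two dependents may share the name 'Kiran' because the discriminator only has to be unique within one owning employee.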
Attributes
Various properties that describe an entity.
The attribute values that describe each entity become a major part of the data stored in the database, as each entity has a value for each of its attributes. E.g. the Employee entity has name, age, phone etc. as attributes.
Simple Attributes
Cannot be divided into sub-parts. E.g. Salary of an employee.
Composite Attributes
Can be divided into sub-parts. E.g. Name can be divided into FirstName and LastName.
Attributes Contd
Stored Attributes
Attributes whose values are stored directly in the database. E.g. DateOfJoin for Employee.
Derived Attributes
The value is derived from the value of a related stored attribute. E.g. EmployeeTenure can be calculated from DateOfJoin.
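As a small sketch of a derived attribute, tenure can be computed on demand from the stored DateOfJoin instead of being stored itself. The function name and the sample dates are assumptions chosen for illustration.

```python
from datetime import date


def employee_tenure_years(date_of_join: date, today: date) -> int:
    """Derive whole years of service from the stored DateOfJoin value."""
    years = today.year - date_of_join.year
    # Subtract one year if this year's joining anniversary has not been reached yet.
    if (today.month, today.day) < (date_of_join.month, date_of_join.day):
        years -= 1
    return years


tenure = employee_tenure_years(date(2015, 6, 1), date(2024, 3, 15))
print(tenure)
```

Because the value is recomputed from DateOfJoin each time, it cannot drift out of step with the stored data.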
Null Attribute
Can take a null value when the entity does not have a value for it or the value is unknown. E.g. the Commission attribute in the Employee table is null for an employee who has no commission.
Key Attribute
Must have a unique value by which any row can be identified. E.g. Deptno for the Department table.
[Figure: ER diagram notation for key, composite and derived attributes]
Relationships
An association among several entities.
Illustrated with a diamond in ER diagrams and read from left to right.
Degree
Number of participating entities in a relationship.
Relationship Set
Collection of all relationships of the same type.
E.g. Employee Works For Department
Constraints On Relationships
Mapping Constraints / Cardinalities
Number of entities to which another entity can be associated through a relationship.
Types
ONE is to ONE
One tuple in an entity is related to only one tuple in another entity. E.g. one department can have only one manager.
ONE is to MANY
One tuple in an entity is related to many tuples in another entity. E.g. one department can have many employees.
MANY is to MANY
Many tuples in an entity are related to many tuples in another entity. E.g. books in a library issued to students.
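The three cardinalities above could be expressed in tables roughly as follows. This is a sketch using Python's sqlite3 module; the table layout, column names and sample rows are assumptions, and other designs are possible.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
CREATE TABLE department (deptno INTEGER PRIMARY KEY, dname TEXT NOT NULL);

-- ONE is to MANY: every employee row points at exactly one department
CREATE TABLE employee (
    empno  INTEGER PRIMARY KEY,
    ename  TEXT NOT NULL,
    deptno INTEGER NOT NULL REFERENCES department(deptno)
);

-- ONE is to ONE: UNIQUE lets each department appear at most once
CREATE TABLE manager (
    empno  INTEGER PRIMARY KEY REFERENCES employee(empno),
    deptno INTEGER NOT NULL UNIQUE REFERENCES department(deptno)
);

-- MANY is to MANY: a separate table holds one row per (student, book) pair
CREATE TABLE student (rollno INTEGER PRIMARY KEY);
CREATE TABLE book (bookid INTEGER PRIMARY KEY);
CREATE TABLE issue (
    rollno INTEGER REFERENCES student(rollno),
    bookid INTEGER REFERENCES book(bookid),
    PRIMARY KEY (rollno, bookid)
);
""")

conn.execute("INSERT INTO department VALUES (10, 'Accounts')")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?)",
                 [(1, "Asha", 10), (2, "Ravi", 10)])
conn.execute("INSERT INTO manager VALUES (1, 10)")
try:
    conn.execute("INSERT INTO manager VALUES (2, 10)")  # second manager for dept 10
    second_manager_allowed = True
except sqlite3.IntegrityError:
    second_manager_allowed = False

conn.executemany("INSERT INTO student VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO book VALUES (?)", [(100,), (101,)])
conn.executemany("INSERT INTO issue VALUES (?, ?)", [(1, 100), (1, 101), (2, 100)])

employees_in_dept = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE deptno = 10").fetchone()[0]
books_for_student_1 = conn.execute(
    "SELECT COUNT(*) FROM issue WHERE rollno = 1").fetchone()[0]
students_for_book_100 = conn.execute(
    "SELECT COUNT(*) FROM issue WHERE bookid = 100").fetchone()[0]
```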
Partial Participation
Not every object in an entity needs to participate in the relationship.
Indicated by a single line between entity and relationship.
E.g. Employee works for department.
Min-Max Notation
(Min, Max) notation states that each entity is related to at least min and at most max relationship instances in the relationship set.
[Figure: example relationships labelled with (1,1) and (1,N) constraints]
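A (min, max) constraint can be checked by counting how many relationship instances each entity takes part in. The sketch below assumes hypothetical entity identifiers (E1, D1, ...) and uses None to stand for N (no upper bound).

```python
from collections import Counter


def min_max_violations(entities, participations, min_count, max_count):
    """Return entities whose number of relationship instances is outside [min, max].

    participations lists one entity id per relationship instance;
    max_count=None stands for N, i.e. no upper bound.
    """
    counts = Counter(participations)
    violations = []
    for entity in entities:
        n = counts[entity]
        if n < min_count or (max_count is not None and n > max_count):
            violations.append(entity)
    return violations


employees = ["E1", "E2", "E3"]
departments = ["D1", "D2"]
works_for = [("E1", "D1"), ("E2", "D1")]  # (employee, department) instances

# Employee side labelled (1,1): each employee works for exactly one department.
violating_employees = min_max_violations(
    employees, [e for e, _ in works_for], 1, 1)
# Department side labelled (1,N): each department has at least one employee.
violating_departments = min_max_violations(
    departments, [d for _, d in works_for], 1, None)
```

Here E3 works for no department and D2 has no employees, so both break a minimum of 1.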
Sub Class
A sub-grouping of a super class.
A more specific version of the super class.
Inherits properties and attributes from its super class.
Generalization
Reverse process of specialization; the bottom-up approach to the super class/subclass relationship.
Process in which we identify the common features of several entity types and generalize them into a single super class, of which the original entity types are special subclasses.
E.g. Car and Bike have several common attributes that can be generalized into the super class Vehicle.
In diagrammatic notation, an arrow pointing to the generalized super class represents generalization, and an arrow pointing to the specialized subclass represents specialization.
Lower-level entities inherit the attributes of their higher-level entities; this is attribute inheritance.
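The Car/Bike example maps naturally onto class inheritance. The sketch below is only an analogy: the attribute names and sample values are assumptions, since the slides do not list the attributes.

```python
from dataclasses import dataclass, fields


@dataclass
class Vehicle:
    """Generalized super class holding the attributes common to Car and Bike."""
    registration_no: str
    manufacturer: str


@dataclass
class Car(Vehicle):
    """Subclass: inherits Vehicle's attributes and adds its own."""
    number_of_doors: int


@dataclass
class Bike(Vehicle):
    has_gears: bool


car = Car("REG-001", "ExampleMotors", 4)
car_attributes = [f.name for f in fields(Car)]
print(car_attributes)
```

The inherited attributes appear first in car_attributes, followed by the attribute that is specific to Car.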
Codd's Rules
Information Rule
All available data in the system should be represented as relations or tables.
Guaranteed Access Rule
Each data item must be accessible without ambiguity by providing the table name, the primary key value of the row and the name of the column to be accessed.
Systematic Treatment Of Null Values
Null values are not equal to a blank space or zero. They are unknown or unassigned values and must be treated systematically.
Self Describing Database
There should be a dynamic online catalog, itself based on the relational model, which keeps information about the tables and data in the database.
Comprehensive Data Sublanguage
The system must support at least one language with a well-defined syntax (e.g. SQL) that supports DML, DDL etc., and through which the data stored in the database can be accessed.
View Updating Rule
All views of data that are theoretically updatable must also be updatable by the system.
Codd's Rules Contd
High Level Insert, Update And Delete
In a relational database, the query language must be capable of performing manipulations on sets of rows in a table, not just on one row at a time.
Physical Data Independence
Any change made to the way data is physically stored must not affect the applications that access the data.
Logical Data Independence
Changes to the logical design of the database should be made in such a way that users and their applications are not affected by them.
Integrity Independence
Data integrity constraints must be definable in the language and stored in the catalog, just as table data is stored in the database, and not in the application program.
Distribution Independence
In an RDBMS, data can be stored centrally on one system or distributed across multiple systems, and users should not have to be aware of which.
Non Subversion Rule
There should be no way to bypass the integrity constraints by using any other (lower-level) language.
Tuple/Records
A single row or tuple contains all the information about a single entity.
Each horizontal row of the table represents a single entity.
A table can have any number of rows, from zero to many thousands.
If the number of rows is zero, it is called an empty table.
Key
The column value that uniquely identifies a single record in the table is called the KEY of the table.
An attribute or set of attributes whose values uniquely identify each entity in an entity set is called the key for that entity set.
A key consisting of a single attribute is called a simple key, while one consisting of a combination of attributes is called a composite key.
Types Of Keys
Super Key
A set of one or more attributes that uniquely identifies a single record in a table. It may contain additional attributes beyond those needed for uniqueness.
Candidate Key
A super key without its unnecessary attributes.
Primary Key
Column or combination of columns whose values uniquely identify a single row in that table.
Secondary Key
Column or combination of columns used for data retrieval.
Foreign Key
A column or collection of columns in one table whose values must match the primary key in some other table. This link is the basis of referential integrity.
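The difference between super keys and candidate keys can be sketched by testing attribute sets against sample rows. The rows and column names below are assumptions. Note also that a key is really a property of the schema: sample data can show that an attribute set is not a key, but never prove that it always will be.

```python
from itertools import combinations


def is_super_key(rows, attrs):
    """True if no two rows share the same combination of values for attrs."""
    seen = set()
    for row in rows:
        value = tuple(row[a] for a in attrs)
        if value in seen:
            return False
        seen.add(value)
    return True


def candidate_keys(rows, attributes):
    """Return the minimal super keys for these rows, smallest first."""
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # Skip supersets of a key already found: they have unnecessary attributes.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_super_key(rows, attrs):
                found.append(attrs)
    return found


attributes = ["empno", "email", "name", "deptno"]
rows = [
    {"empno": 1, "email": "a@example.com", "name": "Amit", "deptno": 10},
    {"empno": 2, "email": "b@example.com", "name": "Amit", "deptno": 10},
    {"empno": 3, "email": "c@example.com", "name": "Nitin", "deptno": 20},
]

keys = candidate_keys(rows, attributes)
print(keys)
```

One of the candidate keys found would be chosen as the primary key; (empno, name) is a super key but not a candidate key, because name is unnecessary.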
Integrity Rules
Entity Integrity
All primary key entries are unique and no part of a primary key may be null. This guarantees that each row has a unique identity and that foreign key values can properly reference primary key values.
Referential Integrity
A foreign key may have either a null entry, as long as it is not part of its table's primary key, or an entry that matches a primary key value in the table to which it is related. It is possible for an attribute NOT to have a corresponding value, but it is impossible to have an invalid entry. Enforcing the referential integrity rule makes it impossible to delete a row in one table whose primary key has mandatory matching foreign key values in another table.
Not Null
As per requirements, some attributes must never hold a NULL value.
Unique
No two tuples can have equal values for the same attribute.
Check
Define your own integrity rule using a CHECK constraint.
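The rules above can be tried out with Python's sqlite3 module. This is a sketch under assumptions: the dept/emp tables, column names and sample values are illustrative, and SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # required for referential integrity in SQLite
conn.executescript("""
CREATE TABLE dept (
    deptno INTEGER PRIMARY KEY,            -- entity integrity
    dname  TEXT NOT NULL UNIQUE            -- NOT NULL and UNIQUE
);
CREATE TABLE emp (
    empno  INTEGER PRIMARY KEY,
    ename  TEXT NOT NULL,
    salary INTEGER CHECK (salary > 0),     -- user-defined CHECK rule
    deptno INTEGER REFERENCES dept(deptno) -- foreign key, may be NULL
);
INSERT INTO dept VALUES (10, 'Accounts');
INSERT INTO emp VALUES (1, 'Asha', 5000, 10);
""")


def rejected(sql):
    """Run one statement and report whether an integrity rule rejected it."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True


results = {
    "duplicate primary key": rejected("INSERT INTO emp VALUES (1, 'Ravi', 4000, 10)"),
    "unknown department": rejected("INSERT INTO emp VALUES (2, 'Ravi', 4000, 99)"),
    "null foreign key": rejected("INSERT INTO emp VALUES (3, 'Ravi', 4000, NULL)"),
    "null name": rejected("INSERT INTO emp VALUES (4, NULL, 4000, 10)"),
    "duplicate dname": rejected("INSERT INTO dept VALUES (20, 'Accounts')"),
    "non-positive salary": rejected("INSERT INTO emp VALUES (5, 'Ravi', -1, 10)"),
    "delete referenced dept": rejected("DELETE FROM dept WHERE deptno = 10"),
}
```

Only the null foreign key is accepted: it has no corresponding value, but it is not an invalid entry.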
Goals Of Normalization
Ensures Data Integrity
Data integrity ensures the correctness of the data stored within the database and can be achieved by imposing data integrity rules. An integrity rule restricts the values present in the database.
Prevents Redundancy
Non-normalized data is stored in several locations, so a modification can make the data inconsistent. Normalized data is stored in only one place. Direct redundancy results from the presence of the same data in two different locations. Indirect redundancy results from storing information that can be computed from other data items stored within the database.
Data Anomalies
Update Anomaly
The same information can be present in multiple records of various tables, so an update to only one of them results in inconsistency.
Insertion Anomaly
Certain facts cannot be recorded at all until other, related facts are available.
Deletion Anomaly
Deletion of some data from a relation necessitates the deletion of unrelated data as well.
Disadvantages Of Normalization
Increases number of relations
Normalization decomposes relations into multiple relations or tables, so higher degrees of normalization typically involve more tables. Applications that use highly normalized tables therefore become more complex.
Reduces performance
Higher degrees of normalization involve more tables and create the need for a larger number of joins which can reduce performance
Some redundancies are unavoidable. While normalizing the tables data integrity should not be compromised
Normal Forms
Normal forms are designed to logically address potential problems such as inconsistency and redundancy in the information stored in the database.
A database is said to be in one of the normal forms if it satisfies the rules required by that form as well as those of the previous forms. It will then not suffer from any of the problems addressed by that form.
Types Of Normal Forms
First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)
Unnormalized Data
Faculty Code  Faculty Name  Date Of Birth  Subject  Hours
100           Yogesh        17/07/64       DSA      16
                                           SS       8
                                           IS       12
101           Amit          24/12/72       MIS      16
                                           PM       8
                                           IS       12
102           Omprakash     03/02/80       PWRC     8
                                           PCOM     8
                                           IP       16
103           Nitin         28/11/66       DT       10
                                           PCOM     8
                                           SS       8
104           Mahesh        01/01/86       DT       10
                                           ADBMS    8
                                           PWRC     8
1NF Table
Faculty Code  Faculty Name  Date Of Birth  Subject  Hours
100           Yogesh        17/07/64       DSA      16
100           Yogesh        17/07/64       SS       8
100           Yogesh        17/07/64       IS       12
101           Amit          24/12/72       MIS      16
101           Amit          24/12/72       PM       8
101           Amit          24/12/72       IS       12
102           Omprakash     03/02/80       PWRC     8
102           Omprakash     03/02/80       PCOM     8
102           Omprakash     03/02/80       IP       16
103           Nitin         28/11/66       DT       10
103           Nitin         28/11/66       PCOM     8
103           Nitin         28/11/66       SS       8
104           Mahesh        01/01/86       DT       10
104           Mahesh        01/01/86       ADBMS    8
104           Mahesh        01/01/86       PWRC     8
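Moving from the unnormalized data to the 1NF table amounts to repeating the faculty details once for every subject in the repeating group. A minimal sketch, using the first two faculty records from the table (the function and variable names are assumptions):

```python
# Each record carries a repeating group of (subject, hours) pairs.
unnormalized = [
    (100, "Yogesh", "17/07/64", [("DSA", 16), ("SS", 8), ("IS", 12)]),
    (101, "Amit", "24/12/72", [("MIS", 16), ("PM", 8), ("IS", 12)]),
]


def to_first_normal_form(records):
    """Emit one row per (faculty, subject) so every column holds a single value."""
    rows = []
    for code, name, dob, subjects in records:
        for subject, hours in subjects:
            rows.append((code, name, dob, subject, hours))
    return rows


rows_1nf = to_first_normal_form(unnormalized)
for row in rows_1nf:
    print(row)
```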
Second Normal Form (2NF)
A relation is in 2NF if it is in 1NF and every non-key attribute is fully functionally dependent on the primary key of the relation, not just on part of the primary key.
2NF prohibits partial dependencies.
Steps
Find and remove attributes that are related to only a part of the key.
Group the removed attributes in another table.
Assign the new table a key that consists of that part of the old composite key.
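The steps can be sketched on the faculty example, assuming the composite key of the 1NF table is (Faculty Code, Subject), that Faculty Name and Date Of Birth depend only on Faculty Code, and that Hours depends only on Subject. The function and table names are illustrative.

```python
rows_1nf = [
    (100, "Yogesh", "17/07/64", "DSA", 16),
    (100, "Yogesh", "17/07/64", "SS", 8),
    (100, "Yogesh", "17/07/64", "IS", 12),
    (101, "Amit", "24/12/72", "MIS", 16),
    (101, "Amit", "24/12/72", "PM", 8),
    (101, "Amit", "24/12/72", "IS", 12),
]


def decompose_to_2nf(rows):
    """Split out attributes that depend on only part of (faculty code, subject)."""
    faculty = {}     # keyed by faculty code -> (name, date of birth)
    subject = {}     # keyed by subject -> hours
    teaches = set()  # the remaining (faculty code, subject) pairs
    for code, name, dob, subj, hours in rows:
        faculty[code] = (name, dob)
        subject[subj] = hours
        teaches.add((code, subj))
    return faculty, subject, teaches


faculty, subject, teaches = decompose_to_2nf(rows_1nf)

# Joining the three tables back together reproduces the original rows.
rejoined = sorted(
    (code, *faculty[code], subj, subject[subj]) for code, subj in teaches
)
```

The hours for IS are now stored once in subject, although two faculties teach it.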
Anomalies
Inserting records of various faculties teaching the same subject results in redundancy of hours information.
As the number of hours is repeated, any change has to be repeated for every instance.
If a faculty leaves the organization, information regarding the subject is also lost.
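The update anomaly can be reproduced on two rows of the 1NF table. In this sketch the new value of 14 hours is a made-up figure; changing only one copy leaves the table contradicting itself.

```python
# Two 1NF rows that repeat the hours for subject IS.
rows = [
    [100, "Yogesh", "IS", 12],
    [101, "Amit", "IS", 12],
]

# Update only one of the copies (14 is a hypothetical new value).
rows[0][3] = 14


def inconsistent_subjects(rows):
    """Return subjects that are recorded with more than one value for hours."""
    hours_seen = {}
    for _code, _name, subject, hours in rows:
        hours_seen.setdefault(subject, set()).add(hours)
    return sorted(s for s, values in hours_seen.items() if len(values) > 1)


conflicts = inconsistent_subjects(rows)
print(conflicts)
```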
Advantages
No redundancy of data for subject and hours while inserting records.
Subject and hours are stored in a separate table, so updating becomes easier.
Even if a faculty record is deleted, subject hours can still be retrieved.
SoldierId  UnitId  OfficerId
1          1       A
2          1       A
3          2       B
Introduction To UML
UML, or Unified Modeling Language, is a specification language used in the software engineering field.
It can be defined as a general-purpose language used to design an abstract model of a system. This model is called a UML model.
UML is commonly used to visualize and construct software-oriented systems. Because software has become much more complex, developers find it more challenging to build complex applications within short periods of time.
UML is a standard proposed specifically for creating specifications of the various components of a complex software system.
Implementation diagram
These are deployment diagrams, which describe the hardware components on which software components are deployed.
Disadvantages
Still no specification for modeling graphical user interfaces.
Poor for distributed systems: no way to formally specify serialization and object persistence.