Introduction to Databases
Definition of Data, Database, and DBMS
Data: Raw, unorganized facts and figures, such as text,
numbers, images, or sounds. When processed, data becomes
information.
Database: A structured, organized collection of related data,
typically stored electronically. It's designed for efficient storage,
retrieval, and management of data.
Database Management System (DBMS): Software that
interacts with end-users, applications, and the database itself to
manage and manipulate data. It provides a systematic way to
create, retrieve, update, and manage data.
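As a rough illustration of the create, retrieve, update, and
manage cycle described above, the sketch below uses SQLite through
Python's built-in sqlite3 module. The accounts table, its columns,
and the sample rows are assumptions made up for this example; any
other DBMS would serve equally well.

```python
import sqlite3

# In-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Create: define a table (hypothetical schema) and insert rows.
conn.execute(
    "CREATE TABLE accounts (id INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
)
conn.executemany(
    "INSERT INTO accounts (owner, balance) VALUES (?, ?)",
    [("Asha", 120.0), ("Ben", 75.5)],
)

# Retrieve: read the rows back.
rows_before = conn.execute(
    "SELECT owner, balance FROM accounts ORDER BY id"
).fetchall()

# Update: change an existing row.
conn.execute("UPDATE accounts SET balance = balance + 30 WHERE owner = ?", ("Ben",))

# Delete: remove a row.
conn.execute("DELETE FROM accounts WHERE owner = ?", ("Asha",))

rows_after = conn.execute(
    "SELECT owner, balance FROM accounts ORDER BY id"
).fetchall()
conn.close()
```

The application only states what it wants in SQL; how the rows are
stored and located is left to the DBMS.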
Overview of Database Applications
Databases are essential in numerous sectors. Examples include:
E-commerce: Managing product inventories, customer
information, and transactions.
Banking: Handling customer accounts, transactions, and loans.
Healthcare: Storing patient records, appointment schedules, and
medical history.
Education: Managing student records, courses, and grades.
Social Media: Storing user profiles, posts, and connections.
Advantages and Disadvantages of DBMS
Advantages:
Data Redundancy Control: Reduces data duplication, saving
storage space and improving consistency.
Data Consistency: Enforces rules (constraints) so that stored
data stays accurate and agrees across the database.
Data Sharing: Allows multiple users to access and share data
simultaneously.
Data Security: Provides mechanisms for controlling access to
data and protecting it from unauthorized use.
Backup and Recovery: Offers tools for backing up data and
restoring it in case of system failures.
Disadvantages:
Cost: High initial investment in hardware, software, and
training.
Complexity: DBMSs can be complex to design, install, and
manage.
Performance: For simple operations, a DBMS can be slower than
direct file access because of the overhead of query processing,
concurrency control, and logging.
Size: Requires significant storage space and memory.
Roles of Database Users and Administrators
Database Users: Individuals who interact with the database.
They include:
o Naive Users: End-users who use applications to interact with
the database without knowing its internal structure (e.g., bank
customers).
o Application Programmers: Develop and maintain applications
that access the database.
o Sophisticated Users: Users with in-depth knowledge of the
database, who can formulate queries using languages like SQL.
Database Administrator (DBA): A skilled professional
responsible for managing and maintaining the database system.
Their roles include:
o Installing and upgrading the DBMS.
o Managing user accounts and security.
o Monitoring database performance.
o Performing backups and recovery.
o Designing and implementing the database schema.
Data Models
Introduction to Data Models
A data model is a conceptual tool used to describe data, data
relationships, data semantics, and consistency constraints. It
provides a blueprint for how a database is structured and how
data is stored, organized, and manipulated.
Types of Data Models
Hierarchical Model: Data is organized in a tree-like structure
with a single root. Relationships are parent-child (1:N): a
parent can have many children, but a child can have only one
parent.
Network Model: An extension of the hierarchical model, where
a child can have multiple parents. This creates a graph-like
structure.
Relational Model: Data is organized into two-dimensional
tables (relations). Each table has rows (tuples) and columns
(attributes). This model is the most widely used today due to its
simplicity and flexibility.
Object-oriented Model: Data and its relationships are
encapsulated into a single structure called an object. This model
supports complex data types and inheritance.
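The structural difference between the first three models can be
sketched with plain Python data structures. This is only an
analogy, not how any real DBMS stores data, and every name in it
(Company, Sales, ProjectX, and so on) is invented for the example.

```python
# Hierarchical model: every child has exactly one parent, so a plain
# child -> parent mapping is enough to describe the tree.
parent_of = {
    "Sales": "Company",
    "Engineering": "Company",
    "Alice": "Sales",
    "Bob": "Engineering",
}

def path_to_root(node):
    """Walk up the tree from node to the root."""
    path = [node]
    while path[-1] in parent_of:
        path.append(parent_of[path[-1]])
    return path

# Network model: a child may have several parents, so each child maps
# to a set of parents and the structure becomes a graph.
parents_of = {
    "Alice": {"Sales", "ProjectX"},
    "Bob": {"Engineering", "ProjectX"},
}

# Relational model: the same facts stored as rows (tuples) of a table
# with the columns (attributes) member and group_name.
membership = [
    ("Alice", "Sales"),
    ("Alice", "ProjectX"),
    ("Bob", "Engineering"),
    ("Bob", "ProjectX"),
]

def groups_of(member):
    """Select the group_name column for rows matching member."""
    return sorted(g for m, g in membership if m == member)
```

In the relational version nothing is navigated through pointers:
a question is answered by filtering rows, which is part of what
makes the model simple and flexible.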
Importance of Data Models in DBMS
Communication: Provides a common language for developers,
designers, and users to understand the database structure.
Conceptualization: Helps in visualizing and planning the
database before implementation.
Standardization: Ensures consistency and uniformity in
database design.
Abstraction: Hides the physical storage details from the users,
allowing them to focus on the logical view of the data.
Database Design
Keys
Keys are essential for uniquely identifying rows in a table and
establishing relationships between tables.
Super Key: A set of one or more attributes that, taken together,
can uniquely identify a tuple in a table.
Candidate Key: A minimal Super Key. It's a Super Key from
which no attribute can be removed without losing its uniqueness
property.
Primary Key: A Candidate Key chosen by the database
designer to uniquely identify each row in a table. A table can
have only one Primary Key. It cannot contain NULL values.
Foreign Key: A set of attributes in one table that refers to the
Primary Key (or another Candidate Key) of a table, usually a
different one but possibly the same table. It establishes a link
between the referencing and referenced rows.
Composite Key: A key made up of two or more attributes that
together uniquely identify a record.
Alternate Key: A Candidate Key that is not chosen as the
Primary Key.
Unique Key: A key that ensures all values in a column (or set of
columns) are unique. Unlike a Primary Key, it may allow NULL
values; how many NULLs are permitted depends on the DBMS.
Surrogate Key: An artificially created key, usually a number,
that is used as a Primary Key. It carries no business meaning
and is not derived from the data itself.
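The relationship between Super Keys and Candidate Keys can be
checked mechanically on a sample of rows. The sketch below is a
minimal illustration with made-up data and hypothetical helper
names (is_super_key, candidate_keys). Note the limitation: a key is
a property of the schema, so sample data can only show that an
attribute set is not a key; it can never prove that it is one for
all future rows.

```python
from itertools import combinations

# Hypothetical sample relation; each dict is one tuple (row).
rows = [
    {"StudentID": 1, "Email": "a@x.edu", "Name": "Ann"},
    {"StudentID": 2, "Email": "b@x.edu", "Name": "Bob"},
    {"StudentID": 3, "Email": "c@x.edu", "Name": "Ann"},
]

def is_super_key(attrs, rows):
    """True if no two rows agree on every attribute in attrs."""
    seen = {tuple(row[a] for a in attrs) for row in rows}
    return len(seen) == len(rows)

def candidate_keys(rows):
    """Return the minimal super keys for this sample of rows."""
    attributes = sorted(rows[0])
    found = []
    for size in range(1, len(attributes) + 1):
        for attrs in combinations(attributes, size):
            # A set containing an already-found smaller key is a
            # super key but not minimal, so skip it.
            if any(set(key) <= set(attrs) for key in found):
                continue
            if is_super_key(attrs, rows):
                found.append(attrs)
    return found
```

On this sample, (Name, StudentID) is a Super Key but not a
Candidate Key, because StudentID alone already identifies each
row. If StudentID were chosen as the Primary Key, Email would be
an Alternate Key.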
Constraints in a Table
Constraints are rules that enforce data integrity and prevent
inconsistent data.
Primary Key Constraint: Ensures that a column or set of
columns uniquely identifies each row and does not contain NULL
values.
Foreign Key Constraint: Ensures that each value in a Foreign
Key column is either NULL or matches an existing value in the
referenced key column of the referenced table, maintaining
referential integrity.
Unique Key Constraint: Guarantees that all values in a
specified column, or combination of columns, are unique within
the table.
NOT NULL Constraint: Prevents a column from having a
NULL value.
CHECK Constraint: Defines a condition that each row must
satisfy. It's used to enforce a specific rule on the data in a
column.
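The five constraints above can be tried out with SQLite from
Python. The departments and employees tables and the helper
rejected() are hypothetical, chosen only for this sketch. One
SQLite-specific detail is an assumption of the setup: foreign keys
are enforced only after PRAGMA foreign_keys = ON.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE departments (
    dept_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL UNIQUE
);
CREATE TABLE employees (
    emp_id  INTEGER PRIMARY KEY,
    email   TEXT NOT NULL UNIQUE,
    salary  REAL NOT NULL CHECK (salary > 0),
    dept_id INTEGER REFERENCES departments(dept_id)
);
INSERT INTO departments VALUES (1, 'Research');
INSERT INTO employees VALUES (1, 'ann@example.com', 5000, 1);
""")

def rejected(sql):
    """Run one statement; return True if a constraint rejects it."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False

attempts = {
    "duplicate primary key": "INSERT INTO employees VALUES (1, 'b@example.com', 10, 1)",
    "unknown department": "INSERT INTO employees VALUES (2, 'b@example.com', 10, 99)",
    "duplicate email": "INSERT INTO employees VALUES (3, 'ann@example.com', 10, 1)",
    "missing email": "INSERT INTO employees VALUES (4, NULL, 10, 1)",
    "negative salary": "INSERT INTO employees VALUES (5, 'c@example.com', -1, 1)",
    "valid row": "INSERT INTO employees VALUES (6, 'd@example.com', 10, NULL)",
}
results = {label: rejected(sql) for label, sql in attempts.items()}
```

Each of the first five attempts violates exactly one constraint
and is refused. The last row is accepted even though its dept_id
is NULL, because a Foreign Key column may be NULL unless it is also
declared NOT NULL.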
Entity-Relationship (ER) Model
The ER Model is a high-level data model used to design a
database. It provides a visual representation of the entities, their
attributes, and the relationships between them.
Entities and Entity Sets: An entity is a real-world object or
concept (e.g., an employee, a product). An entity set is a
collection of similar entities.
Attributes and Relationships: An attribute is a property of an
entity (e.g., an employee's name or salary). A relationship
describes how two or more entities are associated with each
other (e.g., an employee works for a department).
ER Diagrams: A graphical representation of the ER Model
using a specific set of symbols: rectangles for entity sets,
diamonds for relationship sets, and ovals for attributes.
Key Constraints and Weak Entity Sets
Key Constraints (Cardinality Ratios): Define how many entities
of one set can be associated with an entity of another set
through a relationship.
o One-to-one (1:1): An entity in set A is related to at most one
entity in set B, and an entity in set B is related to at most one
entity in set A.
o One-to-many (1:N): An entity in set A can be related to many
entities in set B, but an entity in set B is related to at most
one entity in set A.
o Many-to-many (M:N): An entity in set A can be related to many
entities in set B, and vice versa.
Weak Entity Sets: An entity set that does not have enough
attributes to form a Primary Key of its own. Its existence
depends on another entity set, called the "strong",
"identifying", or "owner" entity set. A weak entity is identified
by its own distinguishing attributes (its partial key) combined
with the Primary Key of the strong entity.
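One common way to carry these ideas into tables is sketched below
with SQLite. It is an illustration under assumptions, not the only
possible mapping: all table and column names are invented, a 1:N
relationship is stored as a Foreign Key on the "many" side, an M:N
relationship gets its own table, and a weak entity gets a
composite Primary Key made of the owner's key plus its own
distinguishing attribute (its partial key).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for FK checks
conn.executescript("""
-- 1:N: one department, many employees. The FK sits on the "many" side.
CREATE TABLE departments (dept_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE employees (
    emp_id  INTEGER PRIMARY KEY,
    name    TEXT NOT NULL,
    dept_id INTEGER NOT NULL REFERENCES departments(dept_id)
);

-- M:N: employees and projects. A separate table holds one row per pairing.
CREATE TABLE projects (proj_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE works_on (
    emp_id  INTEGER REFERENCES employees(emp_id) ON DELETE CASCADE,
    proj_id INTEGER REFERENCES projects(proj_id),
    PRIMARY KEY (emp_id, proj_id)
);

-- Weak entity: a dependent is identified by the owner's key plus its
-- own partial key (dep_name) and is removed together with its owner.
CREATE TABLE dependents (
    emp_id   INTEGER REFERENCES employees(emp_id) ON DELETE CASCADE,
    dep_name TEXT,
    PRIMARY KEY (emp_id, dep_name)
);

INSERT INTO departments VALUES (1, 'Research');
INSERT INTO employees VALUES (10, 'Ann', 1), (11, 'Raj', 1);
INSERT INTO projects VALUES (100, 'Apollo'), (101, 'Hermes');
INSERT INTO works_on VALUES (10, 100), (10, 101), (11, 100);
INSERT INTO dependents VALUES (10, 'Sam'), (11, 'Sam');
""")

def column(sql):
    """Return the first column of every result row as a list."""
    return [row[0] for row in conn.execute(sql)]

staff_in_research = column("SELECT emp_id FROM employees WHERE dept_id = 1 ORDER BY emp_id")
projects_of_ann = column("SELECT proj_id FROM works_on WHERE emp_id = 10 ORDER BY proj_id")
staff_on_apollo = column("SELECT emp_id FROM works_on WHERE proj_id = 100 ORDER BY emp_id")

# Deleting the owner removes its weak entities as well.
conn.execute("DELETE FROM employees WHERE emp_id = 11")
dependents_left = conn.execute("SELECT emp_id, dep_name FROM dependents").fetchall()
```

Two dependents share the name Sam, which is allowed because the
name is only unique per owner. After employee 11 is deleted, only
the dependent owned by employee 10 remains.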
Extended ER Features
Specialization/Generalization: A process of forming a
hierarchy of entity sets based on their shared attributes.
Generalization is a bottom-up approach (e.g., Car, Truck,
Motorcycle are generalized to Vehicle). Specialization is a
top-down approach (e.g., Employee is specialized into Pilot,
Manager, Engineer).
Aggregation: A feature that allows us to treat a relationship set
as a higher-level entity set. This is useful for representing
relationships among relationships.
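Specialization has a close analogue in class inheritance, which
the following sketch uses purely as an illustration; it is not ER
notation. The Employee example comes from the text above, while
the attributes emp_id, name, license_no, and team_size are
assumptions added for the example.

```python
from dataclasses import dataclass

@dataclass
class Employee:
    """Higher-level entity set: attributes shared by all employees."""
    emp_id: int
    name: str

@dataclass
class Pilot(Employee):
    """Specialized entity set: inherits emp_id and name, adds its own."""
    license_no: str

@dataclass
class Manager(Employee):
    team_size: int

staff = [Pilot(1, "Ann", "P-204"), Manager(2, "Raj", 8)]

# Every specialized entity is still a member of the general set.
all_are_employees = all(isinstance(person, Employee) for person in staff)
```

Read top-down, Employee is specialized into Pilot and Manager;
read bottom-up, Pilot and Manager are generalized into Employee
by pulling out the attributes they share.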
Introduction to the Relational Model and Relational Schema
The Relational Model, proposed by E. F. Codd in 1970, is the
foundation of most modern database systems. It organizes data
into two-dimensional tables (relations).
Relational Schema: The description of the structure of a
relation (table); the schema of a relational database is the
collection of these descriptions. It includes:
o Table Name: The name of the relation.
o Attributes: The columns of the table.
o Domains: The set of possible values for each attribute.
o Primary Key: The attribute(s) that uniquely identify each row.
o Foreign Keys: The attributes that link to other tables.
A relational schema for a Students table might look like this:
Students(StudentID, FirstName, LastName, Major, EnrollmentYear)
Here StudentID is the Primary Key.
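As a sketch, the Students schema could be declared in SQLite as
follows. The attribute names and the Primary Key come from the
schema above; the column types (domains) and the NOT NULL choices
are assumptions, since the notes do not specify them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE Students (
    StudentID      INTEGER PRIMARY KEY,  -- Primary Key from the schema
    FirstName      TEXT NOT NULL,        -- assumed domain: text
    LastName       TEXT NOT NULL,
    Major          TEXT,
    EnrollmentYear INTEGER               -- assumed domain: whole number
)
""")

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = conn.execute("PRAGMA table_info(Students)").fetchall()
column_names = [col[1] for col in columns]
primary_key = [col[1] for col in columns if col[5] > 0]
conn.close()
```

Reading the schema back through PRAGMA table_info confirms the
attribute list and shows StudentID as the only Primary Key column.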