Database Design and Applications (CSI ZG518/ SSZG518)
Dr. Lavika Goel
Assistant Professor, Department of CSIS
BITS Pilani (Hyderabad Campus / Pilani Campus)
Today’s Agenda
• General course information
• Objectives
• Reasons to take this course
• Course Logistics
• Course outline
• Introduction to Database Systems
Database Design and Applications (CSI ZG518/ SSZG518) BITS Pilani, Hyderabad Campus
Pilani Campus
General Course Information
Instructor-in-Charge: Dr. Lavika Goel
Office: 6120-J
Email: [email protected]
Objectives
• Introduce the field of Database Management Systems
(DBMS).
• Basic foundation, Principles, Methods etc.
• Trends: Frontier Topics
• Prepare you to take up challenges in DBMS or related
fields.
Reasons to take this course
• Information changes people’s lives.
• Users need to find useful information in a very large amount of data.
• Industries earn money through database management, storage and
efficient retrieval: Google, Yahoo, Bing, etc.
• Many interesting problems in the DBMS field remain unsolved.
Course Logistics
[Pie chart of evaluation components: Quizzes, Assignment, Midsem Examination, Comprehensive Examination. Slice labels in extraction order: 15%, 10%, 45%, 30%.]
Text Book
Ramez Elmasri & Shamkant B. Navathe, Database Systems: Models, Languages,
Design and Application Programming, Pearson Education, 7th Edition, 2017.
Reference Books
R1 Abraham Silberschatz, Henry F Korth and S Sudarshan, Database System
Concepts, McGraw Hill, 6th Ed., 2013
R2 Date C.J., An Introduction to Database Systems, Addison Wesley, 8th Ed.,
2006.
Course Outline
• Overview of Database Systems
• Foundations
– Data Models: ER, Relational Models
– Query languages : RA, SQL
• Design & Development
– Normalization, Application Development
• Efficiency & Scalability
– Storage & Indexing
– Query evaluation & optimization
• Concurrency & Robustness
– Transaction Management – concurrency, recovery
Why Study Databases??
Simply fascinating
– Commercially very relevant !!
DBMS encompasses most of CS
– OS, languages, theory, AI, multimedia, logic
Significance of Databases with Internet
Datasets increasing in diversity and volume.
– Numeric and Textual Databases
– Multimedia Databases
– Geographic Information Systems (GIS)
– Data warehousing, Data mining, Business Intelligence
– Digital libraries, interactive video.
– ... need for DBMS exploding
Tsunami of Data
Telecom data (4.6 bn mobile subscribers)
3 billion telephone calls in the US each day, 30 billion emails daily, 1 billion SMS and IMs.
IP network traffic: up to 1 billion packets per hour per router. Each ISP has many
(hundreds of) routers!
WWW
Weblog data (160 mn websites)
Email data
Satellite imaging data
Social networking sites data
• No. of pics on Facebook
– 15 bn unique photos
– 60 bn photos stored (4 sizes)
Big Names in Database Systems
Company     Product
Oracle      Oracle 8i, 9i, 10g, 11g, 12c, 18c
IBM         DB2, Universal Server
Microsoft   Access, SQL Server-2008
Sybase      Adaptive Server
Informix    Dynamic Server
Who Needs Database Systems
Corporate databases
Typical Applications:
– Personnel management
– Inventory and purchase orders
– Insurance policies and customer data
– ……
Web data management
Typical Applications:
– Web page management
– Personalized web pages
– ……
Examples of Database Applications
Banking services
Airline reservations
Purchases using your credit card
Using the local library
Taking out insurance
Using the Internet
Studying at university
Finance
Human Resources
What is a Database, DBMS,
Database Systems?
A very large, integrated collection of structured data.
– Gigabytes (2^30 or 10^9 bytes), Terabytes, Petabytes
Models real-world enterprise.
– Entities (e.g., students, courses)
– Relationships (e.g., Mohan is taking ISC332)
A Database Management System (DBMS) is a software
package designed to store and manage large databases
with complex features.
Goal : Store and Retrieve database information
conveniently and efficiently
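The entity and relationship idea above can be sketched with Python's built-in sqlite3 module. This is a minimal illustration, not part of the course material: the table names (students, courses, takes) and the id 's1' are assumptions; only Mohan and ISC332 come from the slide.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE students (sid TEXT PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE courses (cid TEXT PRIMARY KEY)")
# A relationship links one student entity to one course entity.
conn.execute(
    "CREATE TABLE takes ("
    " sid TEXT REFERENCES students(sid),"
    " cid TEXT REFERENCES courses(cid))"
)

# Store: two entities and the relationship "Mohan is taking ISC332".
conn.execute("INSERT INTO students VALUES ('s1', 'Mohan')")
conn.execute("INSERT INTO courses VALUES ('ISC332')")
conn.execute("INSERT INTO takes VALUES ('s1', 'ISC332')")

# Retrieve: who is taking which course?
rows = conn.execute(
    "SELECT s.name, t.cid FROM students s JOIN takes t ON s.sid = t.sid"
).fetchall()
print(rows)
```

The DBMS handles storage and lookup; the program only states what to store and what to retrieve.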
Purpose of Database Management Systems (DBMS)
Database management systems were developed to
handle the difficulties caused by different people writing
different applications independently.
Typical DBMS Functionality
• Define a database: in terms of data types, structures and
constraints
• Construct or Load the Database on a secondary storage medium
• Manipulate the database: querying, generating reports,
inserting, deleting and modifying its content
• Concurrent Processing and Sharing by a set of users and
programs – yet, keeping all data valid and consistent
• Other features:
– Protection or Security measures to prevent unauthorized access
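The define, construct and manipulate steps listed above might look like this in a small sqlite3 session. The employee table, its columns and the sample rows are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: data types, structure and a constraint.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary INTEGER CHECK (salary >= 0))"
)

# Construct (load) the database with some made-up rows.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 50000), (2, "Ravi", 40000), (3, "Meena", 60000)],
)

# Manipulate: a modification, a deletion, then a small report query.
conn.execute("UPDATE employee SET salary = salary + 5000 WHERE emp_id = 2")
conn.execute("DELETE FROM employee WHERE emp_id = 3")
report = conn.execute(
    "SELECT COUNT(*), SUM(salary) FROM employee"
).fetchone()
print(report)
```

Concurrent sharing and access control are not shown here; an in-memory SQLite database has a single user.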
Purposes of Database Systems
A DBMS attempts to resolve the following problems:
– Data redundancy and inconsistency by keeping one copy of a data item in the
database
– Difficulty in accessing data by providing query languages and shared libraries
– Data isolation (multiple files and formats)
– Integrity problems by enforcing constraints (age > 0)
– Atomicity of updates
– Concurrent access by multiple users
– Security problems
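As a sketch of the integrity point above, the constraint age > 0 can be declared once in the schema so that every application is held to it. The person table and its rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The constraint lives in the database, not in each application program.
conn.execute("CREATE TABLE person (name TEXT, age INTEGER CHECK (age > 0))")
conn.execute("INSERT INTO person VALUES ('Mohan', 21)")

try:
    # This row violates the CHECK constraint and is refused by the DBMS.
    conn.execute("INSERT INTO person VALUES ('Invalid', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(rejected, count)
```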
Data Independence
One big problem in application development is the separation of applications
from data
Do I need to change my program when I …
– replace my hard drive?
– store the data in a B-tree instead of a hash file?
– partition the data into two physical files (or merge two physical files into one)?
– store salary as floating point number instead of integer?
– develop other applications that use the same set of data?
– add more data fields to support other applications?
– ……
Data Independence
Applications insulated from how data is structured and
stored.
Logical data independence: Protection from changes in
logical structure of data.
Physical data independence: Protection from changes in
physical structure of data.
* One of the most important benefits of using a DBMS!
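A small sqlite3 sketch of both kinds of independence, under assumed names (emp, emp_names, idx_emp_id): the application reads through a view, and its query keeps working after a physical change (a new index) and a logical change (a new column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?, ?, ?)",
                 [(1, "Asha", 50000), (2, "Ravi", 40000)])
# The application only ever reads through this view.
conn.execute("CREATE VIEW emp_names AS SELECT id, name FROM emp")

def app_query():
    """Application code: unchanged throughout this example."""
    return conn.execute(
        "SELECT id, name FROM emp_names ORDER BY id").fetchall()

before = app_query()

# Physical change: add an index (a new access path).
conn.execute("CREATE INDEX idx_emp_id ON emp(id)")
# Logical change: add a field needed by some other application.
conn.execute("ALTER TABLE emp ADD COLUMN dept TEXT")

after = app_query()
print(before == after)
```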
Data Abstraction
The answer to the previous questions is to introduce levels
of abstraction.
Consider: how do function calls allow you to change one part
of your program without affecting other parts?
[Figure: main program, functions, and data]
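The function-call analogy can be made concrete in a few lines of Python. The names (_store, get_salary, total_payroll) and the data are invented for this sketch: only one function knows the data layout, so the layout can change without touching its callers.

```python
# Storage detail: records are tuples kept in a dict keyed by id.
_store = {1: ("Asha", 50000), 2: ("Ravi", 40000)}

def get_salary(emp_id):
    """The only function that knows how records are laid out."""
    return _store[emp_id][1]

def total_payroll(ids):
    """Caller code: depends on get_salary, not on the layout of _store."""
    return sum(get_salary(i) for i in ids)

first = total_payroll([1, 2])

# Change the representation (tuples become dicts) and update only get_salary.
_store = {1: {"name": "Asha", "salary": 50000},
          2: {"name": "Ravi", "salary": 40000}}

def get_salary(emp_id):
    return _store[emp_id]["salary"]

second = total_payroll([1, 2])  # total_payroll itself was not edited
print(first, second)
```

A DBMS applies the same idea at a larger scale: applications call into the logical level, and the physical layout can change underneath.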
Levels of Abstraction
– View level describes how users see the data.
– Logical/Conceptual level defines the logical structure.
– Physical level describes the files and indexes used.
[Figure: View 1, View 2 and View 3 above the Logical Level, above the Physical Level]
Example: University Database
Conceptual / Logical Level:
– Students(sid: string, name: string, login: string,
age: integer, gpa: real)
– Courses(cid: string, cname: string, credits: integer)
– Enrolled(sid: string, cid: string, grade: string)
Physical Level:
– Relations stored as unordered files.
– Index on first column of Students.
View Level:
– Course_info(cid: string, enrollment: integer)
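The three levels of the university example can be sketched in sqlite3. The relation schemas follow the slide; the index name, the sample rows, and the definition of Course_info as a per-course count over Enrolled are assumptions, since the slide gives only the view's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

conn.executescript("""
-- Conceptual / logical level: the three relations from the slide.
CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL);
CREATE TABLE Courses  (cid TEXT, cname TEXT, credits INTEGER);
CREATE TABLE Enrolled (sid TEXT, cid TEXT, grade TEXT);

-- Physical level: an index on the first column of Students.
CREATE INDEX idx_students_sid ON Students(sid);

-- View level: enrollment count per course (assumed definition).
CREATE VIEW Course_info AS
    SELECT cid, COUNT(*) AS enrollment FROM Enrolled GROUP BY cid;
""")

# Sample rows are made up for illustration.
conn.executemany("INSERT INTO Enrolled VALUES (?, ?, ?)",
                 [("s1", "c1", "A"), ("s2", "c1", "B"), ("s1", "c2", "A")])

course_info = conn.execute(
    "SELECT cid, enrollment FROM Course_info ORDER BY cid").fetchall()
print(course_info)
```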
Instances and Schemas
Each level is defined by a schema, which defines the
database design at the corresponding level
– A logical schema defines the logical structure of the database (e.g., set of
customers and accounts and the relationship between them)
– A physical schema defines the file formats and locations
– Subschema describes the database at the view level
– Many views, single conceptual (logical) schema and physical schema.
* Schemas are defined using DDL; data is modified/queried using DML.
A database instance refers to the actual content of the
database at a particular point in time. A database
instance must conform to the corresponding schema
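To see the schema and instance distinction in practice, the sketch below inspects both before and after some DML. The account table and its values are hypothetical; PRAGMA table_info is SQLite's way of listing a table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# DDL defines the schema.
conn.execute("CREATE TABLE account (acc_no TEXT, balance INTEGER)")

def schema():
    # PRAGMA table_info returns one row per column: (cid, name, type, ...).
    return [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(account)")]

def instance():
    return conn.execute("SELECT * FROM account ORDER BY acc_no").fetchall()

schema_before, instance_before = schema(), instance()

# DML changes the instance, not the schema.
conn.execute("INSERT INTO account VALUES ('A-101', 500)")
conn.execute("UPDATE account SET balance = balance + 100 WHERE acc_no = 'A-101'")

schema_after, instance_after = schema(), instance()
print(schema_after, instance_after)
```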
Schema diagram for UNIVERSITY database
[Figure: schema diagram, with a schema construct labeled]
UNIVERSITY Database Instance
Storage Management
A storage manager is a program module that provides the interface between
the low-level data stored in the database and the application programs and
queries submitted to the system.
The storage manager is responsible for the following tasks:
– interaction with the file manager
– efficient storing, retrieving, and updating of data.
Query Processing
1. Parsing and translation
2. Optimization
3. Evaluation
Query Processing (Cont.)
Alternative ways of evaluating a given query
– Equivalent expressions
– Different algorithms for each operation
Cost difference between a good and a bad way of
evaluating a query can be enormous
Need to estimate the cost of operations
– Depends critically on statistical information about
relations which the database must maintain
– Need to estimate statistics for intermediate results to
compute cost of complex expressions
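A toy illustration of how two algorithms for the same join can differ in cost. The relations are invented, and the counters (pairwise comparisons, hash-table probes) are a stand-in for a real optimizer's cost model, which would estimate I/O and CPU from stored statistics.

```python
# Two tiny relations; the data is invented for illustration.
students = [(sid, f"name{sid}") for sid in range(100)]
enrolled = [(sid, "c1") for sid in range(0, 100, 10)]   # 10 rows

def nested_loop_join(r, s):
    """Compare every pair of rows: cost grows as len(r) * len(s)."""
    out, comparisons = [], 0
    for a in r:
        for b in s:
            comparisons += 1
            if a[0] == b[0]:
                out.append((a[0], a[1], b[1]))
    return out, comparisons

def hash_join(r, s):
    """Build a hash table on s, then probe it once per row of r."""
    table, probes = {}, 0
    for b in s:
        table.setdefault(b[0], []).append(b)
    out = []
    for a in r:
        probes += 1
        for b in table.get(a[0], []):
            out.append((a[0], a[1], b[1]))
    return out, probes

nl_result, nl_cost = nested_loop_join(students, enrolled)
hj_result, hj_cost = hash_join(students, enrolled)
print(nl_cost, hj_cost, nl_result == hj_result)
```

Both functions return the same rows; only the amount of work differs, which is the gap a query optimizer tries to exploit.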
Transaction Management
• A transaction is a collection of operations that performs a single logical
function in a database application.
[Figure: timeline of Transaction 1 and Transaction 2 with a conflicting read/write]
Transaction Management (cont.)
Transaction-management component ensures that
the database remains in a consistent (correct) state
despite system failures (e.g. power failures and
operating system crashes) and transaction failures.
Concurrency-control manager controls the
interaction among the concurrent transactions, to
ensure the consistency of the database.
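A minimal sketch of a transaction keeping the database consistent when a failure occurs partway through, using sqlite3. The account table, the transfer function and the amounts are assumptions; the failure here is a constraint violation rather than a power failure, but the rollback behaviour is the point.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (acc_no TEXT PRIMARY KEY,"
             " balance INTEGER CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)",
                 [("A", 100), ("B", 50)])
conn.commit()

def transfer(src, dst, amount):
    """Move money as one transaction: both updates happen, or neither."""
    try:
        with conn:  # commits on success, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_no = ?",
                (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_no = ?",
                (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False

first = transfer("A", "B", 30)     # both updates succeed
second = transfer("A", "B", 500)   # second update violates the CHECK
balances = dict(conn.execute("SELECT acc_no, balance FROM account"))
print(first, second, balances)
```

In the failed transfer the credit to B had already been applied; the rollback undoes it, so no money is created.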
Database Administrator (DBA)
Coordinates all the activities of the database system; the database administrator has a good
understanding of the enterprise’s information resources and needs.
Database administrator’s duties include:
– Schema definition
– Specifying integrity constraints
– Storage structure and access method definition
– Schema and physical organization modification
– Granting user authority to access the database
– Monitoring performance and responding to changes in requirements
(Slide annotations: the first duties are marked "Primary job of a database designer"; the later duties are marked "More system oriented".)
Database Users
Users are differentiated by the way they are expected to interact with the system
Application programmers
– Develop applications that interact with DBMS through DML calls
Sophisticated users
– form requests in a database query language
– mostly one-time ad hoc queries
End users
– invoke one of the existing application programs (e.g., print monthly sales report)
– Interact with applications through GUI
Overall System Architecture
Application Architectures
Two-tier architecture: E.g. client programs using ODBC/JDBC
to communicate with a database
Three-tier architecture: E.g. web-based applications, and
applications built using “middleware”
Characteristics of a Modern DBMS
Data independence.
– Abstraction - hiding lower level details
Efficient data access
– Indexing - Significant for very large databases
Data integrity and security
– Application independent data integrity features
– Simpler Access control mechanisms - Views
Concurrent access, recovery from crashes.
Data Models: Motivation
• Data models define how the logical structure of a
database is modeled.
• Data Models are fundamental entities to introduce
abstraction in a DBMS.
• Data models define how data is connected to each
other.
• Represent complex real-world data structures
graphically.
• Facilitate interaction among the designer, the
applications programmer and the end user.
Categories of data models
• Relational Model: models data and relationships in the form of tables with
fixed attributes.
• Entity-Relationship Model: models real world objects as entities and
relationships between these objects.
• Object-based Data Model: includes concepts like encapsulation,
inheritance etc. (Object relational data model)
• Semi-structured data model: allows more flexibility by permitting individual
data items of the same type to have different sets of attributes, e.g., XML
• Network and Hierarchical data models.
Ques
• What is a relation (in mathematics) ?
Relation
• In mathematics, an n-ary relation on n sets is
any subset of the Cartesian product of the n sets.
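The definition can be checked directly for n = 2 with Python sets. The two sets and the relation named takes are illustrative choices, not from the slides.

```python
from itertools import product

# Two small sets; names and values are illustrative.
students = {"Mohan", "Asha"}
courses = {"ISC332", "CS101"}

# The Cartesian product contains every possible (student, course) pair.
cartesian = set(product(students, courses))

# A binary relation "takes" is any subset of that product.
takes = {("Mohan", "ISC332"), ("Asha", "CS101")}

is_relation = takes <= cartesian   # subset test
print(len(cartesian), is_relation)
```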
Source: Wikipedia
Relational Database
• Relational Database consists of a collection of
tables.
• Each table is assigned a unique name.
• Table depicts the relation between different
attributes.
• Row in a table represents a relationship among a
set of values.
• Relationship between n values is represented by
n-tuple in mathematics which corresponds to a
row in a table.
Prereq relation
Instructor relation
Course relation
Terminology
• In relational model, relation refers to a table.
• Tuple refers to a row.
• Attribute refers to a column of a table.
• The order in which tuples appear in a relation is
irrelevant, since a relation is a set of tuples.
• Domain of an attribute is a set of permitted values.
– Attribute: Age Domain: [0-100]
– Attribute: EmpName Domain: 50 alphabetic chars
– Attribute: Salary Domain: non-negative integer
Relation
• A domain is atomic, if elements of domain are considered to
be indivisible units.
• Null value: signifies that value is unknown or does not
exist.
• Relational schema: list of attributes and their corresponding
domains (optional)
• Students(sid: string, name: string, age: integer, cgpa: real).
• Relational instance refers to specific instance of a relation,
containing specific set of rows.
• #Rows = cardinality, #fields = degree / arity
• Relation == variable
• Relational schema== type definition
• Relational instance == value of the variable
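The three analogies above map onto ordinary Python constructs. In this sketch the Student type plays the role of the relational schema from the slide, the variable students plays the relation, and its current set value is the instance; the sample tuples are invented.

```python
from typing import NamedTuple, Set

# Relational schema ~ type definition (attribute names and domains).
class Student(NamedTuple):
    sid: str
    name: str
    age: int
    cgpa: float

# Relation ~ variable; relational instance ~ its current value (a set of tuples).
students: Set[Student] = {
    Student("s1", "Mohan", 21, 8.5),
    Student("s2", "Asha", 20, 9.1),
}
cardinality_before = len(students)

# Assigning a new value gives a new instance; the schema is unchanged.
students = students | {Student("s3", "Ravi", 22, 7.9)}

cardinality_after = len(students)
degree = len(Student._fields)
print(cardinality_before, cardinality_after, degree)
```

Because the instance is a set, inserting a tuple that is already present leaves it unchanged, which matches the rule that all rows of a relation are distinct.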
Student relation
Relation name / table name: STUDENT
Attributes / columns (collectively, the schema): Name, Student-id, Age, CGPA
Name             Student-id  Age  CGPA
Chan Kin Ho      99223367    23    8.19
Lam Wai Kin      96882145    17   10.00
Man Ko Yee       96452165    22    8.75
Lee Chin Cheung  96154292    16   10.00
Alvin Lam        96520934    15    9.65
Cardinality = 5, degree = 4, all rows distinct
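The stated cardinality and degree can be recomputed from the STUDENT rows. The data below is copied from the table; the domain check borrows the Age domain [0-100] from the Terminology slide and is included only as an illustration.

```python
# Attribute names and rows copied from the STUDENT table.
attributes = ("Name", "Student-id", "Age", "CGPA")
student = {
    ("Chan Kin Ho", 99223367, 23, 8.19),
    ("Lam Wai Kin", 96882145, 17, 10.00),
    ("Man Ko Yee", 96452165, 22, 8.75),
    ("Lee Chin Cheung", 96154292, 16, 10.00),
    ("Alvin Lam", 96520934, 15, 9.65),
}

degree = len(attributes)      # number of attributes (columns)
cardinality = len(student)    # number of tuples (rows); a set keeps rows distinct

# Every Age value lies in the domain [0, 100].
ages_in_domain = all(0 <= row[2] <= 100 for row in student)
print(cardinality, degree, ages_in_domain)
```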
Department and instructor relation
Instructor relation
Department relation
Ques
• Find the Relation name, attributes, tuples,
cardinality , degree, relation schema in relation
instance.
Relation name, attributes, tuples,
cardinality, degree, relation schema?
[Figure: instance of the section relation]