7 Distributed DB

Uploaded by

dine62611

Chapter 7
Distributed Databases and Client-Server Architectures

© 2016
Chapter 7 Outline
1. Distributed Database Concepts
2. Data Fragmentation, Replication and Allocation
3. Types of Distributed Database Systems
4. Query Processing
5. Concurrency Control and Recovery
6. 3-Tier Client-Server Architecture
Distributed Database Concepts
 A transaction can be executed by multiple
networked computers in a unified manner.
 A distributed database (DDB) processes a unit of
execution (a transaction) in a distributed manner.
 A distributed database (DDB) is a collection of
multiple logically related databases distributed over
a computer network.
 A distributed database management system (DDBMS) is
a software system that manages a distributed database
while making the distribution transparent to the user.
Distributed Database System
 Advantages
 Management of distributed data with different
levels of transparency:
 This refers to the physical placement of data (files,
relations, etc.), which is not known to the user
(distribution transparency).
Distributed Database System
 Advantages (transparency, contd.)
 The EMPLOYEE, PROJECT, and WORKS_ON
tables may be fragmented horizontally and stored
with possible replication as shown below.
Distributed Database System
 Advantages (transparency, contd.)
 Distribution and Network transparency:
 Users do not have to worry about operational details of
the network.
 Location transparency refers to the freedom to issue a
command from any location without affecting how it works.
 Naming transparency allows access to any named object
(files, relations, etc.) from any location.
Distributed Database System
 Advantages (transparency, contd.)
 Replication transparency:
 It allows copies of data to be stored at multiple sites, as
shown in the above diagram.
 This is done to minimize access time to the required
data.
 Fragmentation transparency:
 It allows a relation to be fragmented horizontally (create a
subset of tuples of a relation) or vertically (create a
subset of columns of a relation).
Distributed Database System
 Other Advantages
 Increased reliability and availability:
 Reliability refers to system uptime, that is, the system is
up and running most of the time.
 Availability is the probability that the system is
continuously available (usable or accessible) during a
time interval.
 A distributed database system has multiple nodes
(computers) and if one fails then others are available
to do the job.
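The effect of having several nodes can be made concrete with a small calculation. The sketch below is an illustration only: it assumes that each site is up with the same probability and that sites fail independently, neither of which is stated on the slides. The function name `availability_with_replicas` is hypothetical.

```python
def availability_with_replicas(p_site_up: float, n_sites: int) -> float:
    """Probability that at least one of n replica sites is up.

    Assumes every site is up with probability p_site_up and that
    site failures are independent (a simplifying assumption).
    """
    if not 0.0 <= p_site_up <= 1.0:
        raise ValueError("p_site_up must be a probability")
    if n_sites < 1:
        raise ValueError("n_sites must be at least 1")
    # The data is unreachable only if all n sites are down at once.
    return 1.0 - (1.0 - p_site_up) ** n_sites


for n in (1, 2, 3):
    print(n, round(availability_with_replicas(0.9, n), 6))
```

Under these assumptions, each extra copy shrinks the chance that no copy is reachable by a factor of (1 - p).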
Distributed Database System
 Other Advantages (contd.)
 Improved performance:
 A distributed DBMS fragments the database to keep data
closer to where it is needed most.
 This reduces data management (access and
modification) time significantly.
 Easier expansion (scalability):
 Allows new nodes (computers) to be added anytime
without changing the entire configuration.
Data Fragmentation, Replication and
Allocation
 Data Fragmentation
 Split a relation into logically related and correct
parts. A relation can be fragmented in two ways:
 Horizontal Fragmentation
 Vertical Fragmentation
Data Fragmentation, Replication and
Allocation
 Horizontal fragmentation
 It is a horizontal subset of a relation that contains those
tuples which satisfy a selection condition.
 Consider the Employee relation with selection condition
(DNO = 5). All tuples that satisfy this condition form a
subset which is a horizontal fragment of the Employee
relation.
 A selection condition may be composed of several
conditions connected by AND or OR.
 Derived horizontal fragmentation: It applies the partitioning of a
primary relation to other secondary relations which are related
to it through foreign keys.
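A minimal sketch of horizontal fragmentation on the (DNO = 5) condition follows. The sample rows and the helper name `horizontal_fragment` are hypothetical and serve only as an illustration; a real DDBMS would define fragments in its catalog rather than in application code.

```python
# Hypothetical sample of the Employee relation (only a few columns shown).
EMPLOYEE = [
    {"SSN": "111", "Lname": "Smith", "DNO": 5},
    {"SSN": "222", "Lname": "Wong", "DNO": 5},
    {"SSN": "333", "Lname": "Zelaya", "DNO": 4},
]


def horizontal_fragment(relation, condition):
    """Return the tuples of `relation` that satisfy the selection `condition`."""
    return [row for row in relation if condition(row)]


# Fragment for department 5 and a complementary fragment for everything else.
emp_d5 = horizontal_fragment(EMPLOYEE, lambda r: r["DNO"] == 5)
emp_rest = horizontal_fragment(EMPLOYEE, lambda r: r["DNO"] != 5)

# When the fragments are complete and disjoint, their union rebuilds the relation.
reconstructed = emp_d5 + emp_rest
print(len(emp_d5), len(emp_rest), len(reconstructed))
```

Conditions joined by AND or OR fit the same helper, for example `lambda r: r["DNO"] == 5 or r["DNO"] == 4`.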
Data Fragmentation, Replication and
Allocation
 Vertical fragmentation
 It is a subset of a relation which is created by a subset of
columns. Thus a vertical fragment of a relation will
contain values of selected columns. There is no selection
condition used in vertical fragmentation.
 Consider the Employee relation. A vertical fragment can be
created by keeping the values of Name, Bdate, Sex, and Address.
 Because there is no condition for creating a vertical
fragment, each fragment must include the primary key
attribute of the parent relation Employee. In this way all
vertical fragments of a relation are connected.
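The role of the primary key can be illustrated with a short sketch. The rows, the choice of SSN as the key, and the helper names `vertical_fragment` and `rejoin` are assumptions made for the example, not part of the slides.

```python
# Hypothetical Employee rows; SSN is assumed to be the primary key.
EMPLOYEE = [
    {"SSN": "111", "Name": "Smith", "Bdate": "1965-01-09", "Sex": "M",
     "Address": "Houston", "Salary": 30000, "DNO": 5},
    {"SSN": "222", "Name": "Wong", "Bdate": "1955-12-08", "Sex": "M",
     "Address": "Houston", "Salary": 40000, "DNO": 5},
]


def vertical_fragment(relation, columns, key="SSN"):
    """Project `relation` onto `columns`, always keeping the primary key."""
    kept = [key] + [c for c in columns if c != key]
    return [{c: row[c] for c in kept} for row in relation]


def rejoin(frag_a, frag_b, key="SSN"):
    """Rebuild full rows by joining two vertical fragments on the key."""
    by_key = {row[key]: row for row in frag_b}
    return [{**row, **by_key[row[key]]} for row in frag_a]


personal = vertical_fragment(EMPLOYEE, ["Name", "Bdate", "Sex", "Address"])
work = vertical_fragment(EMPLOYEE, ["Salary", "DNO"])
print(rejoin(personal, work) == EMPLOYEE)
```

Without the key in both fragments there would be no way to tell which salary belongs to which name.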
Data Fragmentation, Replication and
Allocation
 Data Replication
 The database, or part of it, is copied to more than one site.
 In full replication the entire database is replicated at every site, and in partial
replication some selected part is replicated to some of the sites.
 Data replication is achieved through a replication schema.
 Data Distribution (Data Allocation)
 This is relevant only in the case of partial replication or
partition.
 The selected portion of the database is distributed to the
database sites.

Types of Distributed Database Systems
 Homogeneous
 All sites of the database system have an identical setup,
i.e., the same database system software.
 For example, all sites run Oracle, or DB2, or Sybase,
or some other database system.
 The underlying operating systems may differ and can be
a mixture of Linux, Windows, Unix, etc.
Types of Distributed Database Systems
 Heterogeneous
 Federated: Each site may run a different database system, but data
access is managed through a single conceptual schema.
 This implies that the degree of local autonomy is minimum. Each site must
adhere to a centralized access policy. There may be a global schema.
 Multidatabase: There is no one conceptual global schema. For data
access a schema is constructed dynamically as needed by the
application software.
Query Processing in Distributed
Databases
 Issues
 Cost of transferring data (files and results) over the network.
 This cost is usually high so some optimization is necessary.
 Example relations: Employee at site 1 and Department at site 2
 Employee at site 1. 10,000 rows. Row size = 100 bytes.
Table size = 10^6 bytes.
Employee(Fname, Minit, Lname, SSN, Bdate, Address, Sex, Salary, Superssn, Dno)
 Department at site 2. 100 rows. Row size = 35 bytes.
Table size = 3,500 bytes.
Department(Dname, Dnumber, Mgrssn, Mgrstartdate)
 Q: For each employee, retrieve the employee name and the
name of the department where the employee works.
 Q: $\pi_{Fname, Lname, Dname}(Employee \bowtie_{Dno = Dnumber} Department)$

Query Processing in Distributed Databases

 Result
 The result of this query will have 10,000 tuples,
assuming that every employee is related to a
department.
 Suppose each result tuple is 40 bytes long.
 The query is submitted at site 3 and the result is sent to
this site.
 Problem: Employee and Department relations are not
present at site 3.
Query Processing in Distributed Databases

 Strategies:
1. Transfer Employee and Department to site 3.
 Total transfer bytes = 1,000,000 + 3500 = 1,003,500 bytes.
2. Transfer Employee to site 2, execute join at site 2 and send the
result to site 3.
 Query result size = 40 * 10,000 = 400,000 bytes.
Total transfer size = 400,000 + 1,000,000 =
1,400,000 bytes.
3. Transfer Department relation to site 1, execute the join at site
1, and send the result to site 3.
 Total bytes transferred = 400,000 + 3500 = 403,500 bytes.
 Optimization criteria: minimizing data transfer.
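The arithmetic behind the three strategies can be checked with a few lines of code. This is only a sketch of the byte counting used above, with the relation sizes and the 40-byte result tuple taken from the example; the function name `strategy_costs` and the dictionary labels are hypothetical.

```python
EMPLOYEE_BYTES = 10_000 * 100   # 10,000 rows of 100 bytes = 1,000,000 bytes
DEPARTMENT_BYTES = 100 * 35     # 100 rows of 35 bytes = 3,500 bytes
RESULT_TUPLE_BYTES = 40


def strategy_costs(result_tuples: int) -> dict:
    """Bytes transferred by each strategy when the result site is site 3."""
    result_bytes = result_tuples * RESULT_TUPLE_BYTES
    return {
        "1: ship both relations to site 3": EMPLOYEE_BYTES + DEPARTMENT_BYTES,
        "2: ship Employee to site 2, result to site 3": EMPLOYEE_BYTES + result_bytes,
        "3: ship Department to site 1, result to site 3": DEPARTMENT_BYTES + result_bytes,
    }


costs_q = strategy_costs(result_tuples=10_000)
for name, cost in costs_q.items():
    print(f"{name}: {cost:,} bytes")

# Under the "minimize data transfer" criterion, pick the cheapest strategy.
best = min(costs_q, key=costs_q.get)
print("cheapest:", best)
```

For Q the cheapest option is strategy 3 at 403,500 bytes, because the small Department relation is the only base relation that has to move.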
Query Processing in Distributed Databases

 Consider the query
 Q’: For each department, retrieve the department
name and the name of the department manager
 Relational Algebra expression:
$\pi_{Fname, Lname, Dname}(Employee \bowtie_{Mgrssn = SSN} Department)$
Query Processing in Distributed Databases

 The result of this query will have 100 tuples, assuming that
every department has a manager. The execution strategies
are:
1. Transfer Employee and Department to the result site and
perform the join at site 3.
 Total bytes transferred = 1,000,000 + 3500 = 1,003,500
bytes.
2. Transfer Employee to site 2, execute join at site 2 and send
the result to site 3. Query result size = 40 *
100 = 4000 bytes.
 Total transfer size = 4000 + 1,000,000 = 1,004,000 bytes.
3. Transfer Department relation to site 1, execute join at site 1
and send the result to site 3.
 Total transfer size = 4000 + 3500 = 7500 bytes.
Query Processing in Distributed Databases

 Now suppose the result site is 2. Possible strategies:
1. Transfer Employee relation to site 2, execute the
query and present the result to the user at site 2.
 Total transfer size = 1,000,000 bytes for both
queries Q and Q’.
2. Transfer Department relation to site 1, execute join
at site 1 and send the result back to site 2.
 Total transfer size for Q = 400,000 + 3500 =
403,500 bytes and
 for Q’ = 4000 + 3500 = 7500 bytes.
Query Processing in Distributed
Databases
 Semijoin:
 Objective is to reduce the number of tuples in a relation
before transferring it to another site.
 Example execution of Q or Q’:
1. Project the join attributes of Department at site 2, and transfer
them to site 1. For Q, 4 * 100 = 400 bytes
are transferred and for Q’, 9 * 100 = 900 bytes are transferred.
2. Join the transferred file with the Employee relation at site 1, and
transfer the required attributes from the resulting file to site 2.
For Q, 34 * 10,000 = 340,000 bytes are transferred and
for Q’, 39 * 100 = 3900 bytes are transferred.
3. Execute the query by joining the transferred file with
Department and present the result to the user at site 2.
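The byte counts for the semijoin steps can be totalled and compared with shipping the whole Employee relation (1,000,000 bytes) to site 2. The sketch below reuses the column and row sizes quoted in the steps above; the function name `semijoin_bytes` and the toy rows at the end are hypothetical and only illustrate the reduction.

```python
def semijoin_bytes(join_attr_bytes, dept_rows, shipped_row_bytes, shipped_rows):
    """Bytes moved by steps 1 and 2 of the semijoin strategy."""
    step1 = join_attr_bytes * dept_rows        # join column: site 2 -> site 1
    step2 = shipped_row_bytes * shipped_rows   # reduced Employee: site 1 -> site 2
    return step1 + step2


q_bytes = semijoin_bytes(4, 100, 34, 10_000)       # 400 + 340,000
q_prime_bytes = semijoin_bytes(9, 100, 39, 100)    # 900 + 3,900
print(f"Q: {q_bytes:,} bytes, Q': {q_prime_bytes:,} bytes")

# Toy illustration of the reduction itself, using made-up rows.
department_mgrssn = {"333", "888"}   # join column projected at site 2
employee = [
    {"SSN": "123", "Fname": "John", "Lname": "Smith"},
    {"SSN": "333", "Fname": "Franklin", "Lname": "Wong"},
    {"SSN": "888", "Fname": "James", "Lname": "Borg"},
]
# Only Employee tuples that will actually join are kept for shipping.
reduced = [e for e in employee if e["SSN"] in department_mgrssn]
print([e["Lname"] for e in reduced])
```

The saving is modest for Q, where every employee joins, and large for Q’, where only the 100 managers survive the reduction.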

Concurrency Control and Recovery
 Distributed Databases encounter a number of
concurrency control and recovery problems which are
not present in centralized databases. Some
of them are listed below.
 Dealing with multiple copies of data items
 Failure of individual sites
 Communication link failure
 Distributed commit
 Distributed deadlock
Concurrency Control and Recovery
 Details
 Dealing with multiple copies of data items:
 The concurrency control must maintain global
consistency. Likewise the recovery
mechanism must recover all copies and maintain
consistency after recovery.
 Failure of individual sites:
 Database availability must not be affected due to the
failure of one or two sites and the recovery scheme
must recover them before they are available for use.
Concurrency Control and Recovery
 Details (contd.)
 Communication link failure:
 This failure may create a network partition, which would affect
database availability even though all database sites may be
running.
 Distributed commit:
 A transaction may be split into parts that are executed at a
number of sites. This requires a two-phase or three-phase
commit approach for transaction commit.
 Distributed deadlock:
 Since transactions are processed at multiple sites, two or more sites
may get involved in deadlock. This must be resolved in a distributed
manner.
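The two-phase commit approach mentioned under distributed commit can be sketched as follows. This is a simplified, in-memory illustration: the class and function names are hypothetical, and the sketch leaves out logging, time-outs, and coordinator failure, all of which a real protocol has to handle.

```python
from enum import Enum


class Vote(Enum):
    COMMIT = "commit"
    ABORT = "abort"


class Participant:
    """One site executing part of a distributed transaction (hypothetical model)."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        # Phase 1: the site votes on whether it is able to commit its part.
        self.state = "prepared" if self.can_commit else "aborted"
        return Vote.COMMIT if self.can_commit else Vote.ABORT

    def commit(self):
        self.state = "committed"

    def abort(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Return True if the transaction committed at every site, else False."""
    # Phase 1 (voting): collect a vote from every participant.
    votes = [p.prepare() for p in participants]
    decision = all(v is Vote.COMMIT for v in votes)
    # Phase 2 (decision): the same outcome is applied at every site.
    for p in participants:
        if decision:
            p.commit()
        else:
            p.abort()
    return decision


ok_sites = [Participant("site1"), Participant("site2")]
mixed_sites = [Participant("site1"), Participant("site2", can_commit=False)]
print(two_phase_commit(ok_sites), two_phase_commit(mixed_sites))
```

A single abort vote is enough to abort the transaction everywhere, which is what keeps the sites consistent with each other.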
Concurrency Control and Recovery
 Distributed concurrency control based on a
distinguished copy of a data item
 Primary site technique: A single site is
designated as a primary site which serves as a
coordinator for transaction management.
Concurrency Control and Recovery
 Transaction management:
 Concurrency control and commit are managed by this
site.
 In two-phase locking, this site manages locking and
releasing data items. If all transactions
follow the two-phase policy at all sites, then serializability
is guaranteed.
Concurrency Control and Recovery
 Transaction Management
 Advantages:
 It is an extension of centralized two-phase locking, so
implementation and management are simple.
 Data items are locked only at one site but they can be
accessed at any site.
 Disadvantages:
 All transaction management activities go to the primary site, which is
likely to overload that site.
 If the primary site fails, the entire system is inaccessible.
 To aid recovery, a backup site is designated which behaves as a
shadow of the primary site. In case of primary site failure, the backup
site can act as the primary site.
Concurrency Control and Recovery
 Primary Copy Technique:
 In this approach, instead of a site, a data item partition is
designated as primary copy. To lock a data item just the
primary copy of the data item is locked.
 Advantages:
 Since primary copies are distributed at various sites, a single site
is not overloaded with locking and unlocking requests.
 Disadvantages:
 Identification of a primary copy is complex. A distributed
directory must be maintained, possibly at all sites.
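The primary copy idea, including the directory it depends on, can be sketched as below. The classes `Site` and `PrimaryCopyLockManager` are hypothetical, and the sketch assumes exclusive locks only and an in-memory directory, which is far simpler than a real distributed directory.

```python
class Site:
    """A database site with its own lock table (exclusive locks only)."""

    def __init__(self, name):
        self.name = name
        self.locks = {}  # data item -> id of the transaction holding the lock

    def lock(self, item, txn):
        holder = self.locks.get(item)
        if holder is None or holder == txn:
            self.locks[item] = txn
            return True
        return False

    def unlock(self, item, txn):
        if self.locks.get(item) == txn:
            del self.locks[item]


class PrimaryCopyLockManager:
    """Routes each lock request to the site that holds the item's primary copy."""

    def __init__(self, directory):
        self.directory = directory  # data item -> Site holding the primary copy

    def lock(self, item, txn):
        return self.directory[item].lock(item, txn)

    def unlock(self, item, txn):
        self.directory[item].unlock(item, txn)


site1, site2 = Site("site1"), Site("site2")
manager = PrimaryCopyLockManager({"X": site1, "Y": site2})

print(manager.lock("X", "T1"))  # T1 locks X at site1
print(manager.lock("Y", "T2"))  # T2 locks Y at site2, so load is spread
print(manager.lock("X", "T2"))  # T2 must wait: X is held by T1
```

Every lock request needs a directory lookup first, which is the complexity the disadvantage above refers to.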
Concurrency Control and Recovery
 Recovery from a coordinator failure
 In both approaches a coordinator site or copy may become
unavailable. This will require the selection of a new
coordinator.
 Primary site approach with no backup site:
 Aborts and restarts all active transactions at all sites.
Elects a new coordinator and initiates transaction processing.
 Primary site approach with backup site:
 Suspends all active transactions, designates the backup site as
the primary site, and identifies a new backup site. The new primary
site receives all transaction management information to
resume processing.
 Primary and backup sites fail or no backup site:
 Use election process to select a new coordinator site.
Concurrency Control and Recovery
 Concurrency control based on voting:
 There is no primary copy or coordinator.
 A lock request is sent to all sites that have a copy of the data item.
 If a majority of sites grant the lock, then the requesting
transaction gets the data item.
 Locking information (granted or denied) is sent to all
these sites.
 To avoid an unacceptably long wait, a time-out period is
defined. If the requesting transaction does not
get any vote information within this period, then the transaction is
aborted.
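The voting scheme can be sketched with a simple majority count. In this illustration a site that does not answer before the time-out is modelled as returning `None`; the names `CopySite` and `request_lock` are hypothetical, and the sketch ignores message delays and the release of locks.

```python
class CopySite:
    """A site holding one copy of each data item (hypothetical model)."""

    def __init__(self, name, responsive=True):
        self.name = name
        self.responsive = responsive
        self.holder = {}  # data item -> id of the transaction holding the lock

    def vote(self, item, txn):
        """Return True (grant), False (deny), or None (no reply before time-out)."""
        if not self.responsive:
            return None
        return self.holder.get(item) in (None, txn)

    def record(self, item, txn):
        self.holder[item] = txn


def request_lock(item, txn, sites):
    """Grant the lock only if a majority of the copy sites vote for it."""
    granting = [s for s in sites if s.vote(item, txn) is True]
    granted = len(granting) > len(sites) / 2
    if granted:
        # Record the lock at the sites that voted to grant it.
        for s in granting:
            s.record(item, txn)
    return granted


sites = [CopySite("s1"), CopySite("s2"), CopySite("s3", responsive=False)]
print(request_lock("X", "T1", sites))  # 2 of 3 grant: T1 gets the lock
print(request_lock("X", "T2", sites))  # 0 of 3 grant: T2 is refused
```

Because two majorities always overlap in at least one site, two transactions cannot both collect a majority for the same item.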
Client-Server Database Architecture
 It consists of clients running client software, a set of
servers which provide all database functionalities and
a reliable communication infrastructure.
[Figure: Server 1, Server 2, ..., Server n and Client 1, Client 2, Client 3, ..., Client n]
Client-Server Database Architecture
 Clients reach the server for a desired service, but the
server does not reach clients.
 The server software is responsible for local data
management at a site, much like centralized DBMS
software.
 The client software is responsible for most of the
distribution function.
 The communication software manages
communication among clients and servers.
Client-Server Database Architecture
 The processing of a SQL query goes as follows:
 The client parses a user query and decomposes it into a
number of independent subqueries. Each subquery is
sent to the appropriate site for execution.
 Each server processes its query and sends the
result to the client.
 The client combines the results of subqueries and
produces the final result.
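The three steps above can be sketched with in-memory stand-ins for the servers. The sketch assumes that the Employee relation is horizontally fragmented across two servers and that each subquery is the same selection applied to the local fragment; the `Server` and `Client` classes and the sample rows are hypothetical.

```python
class Server:
    """Manages the local data at one site and answers subqueries on it."""

    def __init__(self, name, rows):
        self.name = name
        self.rows = rows

    def execute(self, predicate):
        # Each server evaluates its subquery against local data only.
        return [row for row in self.rows if predicate(row)]


class Client:
    """Decomposes a query, sends subqueries out, and combines the results."""

    def __init__(self, servers):
        self.servers = servers

    def query(self, predicate):
        # One subquery per server; here every subquery is the same selection.
        partial_results = [server.execute(predicate) for server in self.servers]
        # Combine the partial results (a union of the fragments' answers).
        return [row for part in partial_results for row in part]


server1 = Server("server1", [{"Lname": "Smith", "Salary": 30000},
                             {"Lname": "Wong", "Salary": 40000}])
server2 = Server("server2", [{"Lname": "Borg", "Salary": 55000}])

client = Client([server1, server2])
high_paid = client.query(lambda row: row["Salary"] >= 40000)
print([row["Lname"] for row in high_paid])
```

The servers never talk to each other here, and all knowledge of how the data is distributed sits in the client.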
Recap
 Distributed Database Concepts
 Data Fragmentation, Replication and Allocation
 Types of Distributed Database Systems
 Query Processing
 Concurrency Control and Recovery
 3-Tier Client-Server Architecture
The end of the course !!!
Thank you.
