
Types of Distributed Data Base System_49724

A Distributed Database Management System (DDBMS) allows databases to be spread across multiple sites, providing a unified interface for users. There are two main types of distributed databases: homogeneous, where all sites use identical systems, and heterogeneous, where different systems are used across sites. Data can be stored through replication or fragmentation, each with its own advantages and challenges, impacting data consistency and query processing.

Uploaded by

Kanishk Chauhan
Copyright
© All Rights Reserved

DISTRIBUTED DATABASE SYSTEM
 What is DDBMS?
 A distributed database is a database that is not
limited to one system. It is spread over different
sites, i.e., over multiple computers or a network
of computers.
 A distributed database system is located on
various sites that don’t share physical
components. This may be required when a
particular database needs to be accessed by
various users globally.
 A distributed database management system (DDBMS)
manages it so that, to the users, it looks like
one single database.
 Types:
1. Homogeneous Database:
 In a homogeneous database, all sites store
the database identically. The operating
system, the database management system
and the data structures used are the same
at all sites. Hence, they’re easy to manage.
2. Heterogeneous Database:
 In a heterogeneous distributed database, different
sites can use different schemas and software, which
can lead to problems in query processing and
transactions.
 Also, a particular site might be completely
unaware of the other sites. Different computers
may use different operating systems and different
database applications.
 They may even use different data models for the
database. Hence, translations are required for
different sites to communicate.
 Distributed Data Storage
 There are 2 ways in which data can be stored on
different sites. These are:
1. Replication
In this approach, an entire relation is stored
redundantly at 2 or more sites. If the entire database is
available at all sites, it is a fully redundant database.
 Hence, in replication, systems maintain copies of data.
This is advantageous as it increases the availability of
data at different sites.
 Also, query requests can now be processed in parallel.
 However, it has certain disadvantages as well. Data
needs to be constantly updated: any change made at
one site needs to be recorded at every site where that
relation is stored, or else it may lead to
inconsistency. This is a lot of overhead.
 Also, concurrency control becomes much
more complex, as concurrent access now
needs to be checked over a number of
sites.
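The trade-off described above can be illustrated with a minimal sketch. It assumes a simple "write to all copies, read from any one copy" scheme, which the slides do not prescribe; the class name, site names and sample row are all hypothetical.

```python
class ReplicatedRelation:
    """Toy model: one relation copied in full to several sites."""

    def __init__(self, site_names):
        # Each site holds its own complete copy (key -> row).
        self.replicas = {name: {} for name in site_names}
        self.messages_sent = 0  # counts update propagations

    def write(self, key, row):
        # Write-all: the change is applied at every site holding the relation.
        for copy in self.replicas.values():
            copy[key] = dict(row)
            self.messages_sent += 1

    def read(self, site, key):
        # Read-one: any single replica can answer, so reads stay local.
        return self.replicas[site].get(key)


accounts = ReplicatedRelation(["delhi", "mumbai", "pune"])
accounts.write(101, {"owner": "Asha", "balance": 500})

# Every site returns the same row, but one logical write touched all copies.
rows = [accounts.read(site, 101) for site in accounts.replicas]
print(rows[0] == rows[1] == rows[2], accounts.messages_sent)
```

Skipping any replica inside `write` would leave that site with stale data, which is the inconsistency the slide warns about.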
2. Fragmentation
In this approach, the relations are
fragmented (i.e., they’re divided into
smaller parts) and each of the fragments
is stored at the sites where it is
required. It must be made sure that the
fragments are such that they can be
used to reconstruct the original relation
(i.e., there isn’t any loss of data).
 Fragmentation is advantageous because it
doesn’t create copies of data, so
consistency is not a problem.
Fragmentation of relations can be done
in two ways:
 Horizontal fragmentation – Splitting by
rows – The relation is fragmented into
groups of tuples so that each tuple is
assigned to at least one fragment.
 Vertical fragmentation – Splitting by
columns – The schema of the relation is
divided into smaller schemas. Each
fragment must contain a common
candidate key so as to ensure lossless
join.
 In certain cases, a hybrid of
fragmentation and replication
is used.
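As a rough illustration of the two fragmentation styles and the reconstruction requirement, the sketch below splits a small relation by rows and by columns and then rebuilds it. The EMPLOYEE relation, its column names and the helper functions are assumptions made for this example only.

```python
# Hypothetical EMPLOYEE relation; emp_id is the candidate key.
employee = [
    {"emp_id": 1, "name": "Asha", "city": "Delhi", "salary": 50000},
    {"emp_id": 2, "name": "Ravi", "city": "Mumbai", "salary": 62000},
    {"emp_id": 3, "name": "Meena", "city": "Delhi", "salary": 58000},
]


def horizontal_fragment(rows, predicate):
    """Split by rows: tuples satisfying the predicate vs. the rest."""
    return ([r for r in rows if predicate(r)],
            [r for r in rows if not predicate(r)])


def vertical_fragment(rows, key, columns):
    """Split by columns: every fragment keeps the key column."""
    return [{c: r[c] for c in [key, *columns]} for r in rows]


def join_on_key(left, right, key):
    """Rebuild full rows by matching the shared key."""
    right_by_key = {r[key]: r for r in right}
    return [{**row, **right_by_key[row[key]]} for row in left]


def sort_by_id(rows):
    return sorted(rows, key=lambda r: r["emp_id"])


# Horizontal: reconstruct with a union of the fragments.
delhi, others = horizontal_fragment(employee, lambda r: r["city"] == "Delhi")
rebuilt_h = sort_by_id(delhi + others)

# Vertical: reconstruct with a join on the common key.
personal = vertical_fragment(employee, "emp_id", ["name", "city"])
payroll = vertical_fragment(employee, "emp_id", ["salary"])
rebuilt_v = sort_by_id(join_on_key(personal, payroll, "emp_id"))

print(rebuilt_h == employee, rebuilt_v == employee)
```

If `vertical_fragment` dropped the key column, `join_on_key` would have nothing to match on, which is why each vertical fragment must carry a common candidate key.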
Types of Distributed Databases

 Distributed databases can be broadly
classified into homogeneous and
heterogeneous distributed database
environments, each with further
subdivisions, as shown in the following
illustration.
 Homogeneous Distributed Databases
 In a homogeneous distributed database, all the
sites use identical DBMS and operating systems.
Its properties are −
 The sites use very similar software.
 The sites use identical DBMS or DBMS from the
same vendor.
 Each site is aware of all other sites and cooperates
with other sites to process user requests.
 The database is accessed through a single
interface as if it is a single database.
 Types of Homogeneous Distributed Database
 There are two types of homogeneous
distributed database −
 Autonomous − Each database is independent
and functions on its own. The databases are
integrated by a controlling application and use
message passing to share data updates.
 Non-autonomous − Data is distributed across
the homogeneous nodes and a central or
master DBMS co-ordinates data updates across
the sites.
 Heterogeneous Distributed Databases
 In a heterogeneous distributed database, different sites
have different operating systems, DBMS products and
data models. Its properties are −
 Different sites use dissimilar schemas and software.
 The system may be composed of a variety of DBMSs
like relational, network, hierarchical or object-oriented.
 Query processing is complex due to dissimilar schemas.
 Transaction processing is complex due to dissimilar
software.
 A site may not be aware of other sites and so there is
limited co-operation in processing user requests.
 Types of Heterogeneous Distributed
Databases
 Federated − The heterogeneous database
systems are independent in nature and
integrated together so that they function as
a single database system.
 Un-federated − The database systems
employ a central coordinating module
through which the databases are accessed.
 Federated database management system
issues
Last Updated: 08-10-2018
 A federated database system (FDS) is a system in
which each server is an autonomous, centralized
DBMS that has its own local users. The term is
used when there is some global view or schema of
the federation of databases that is shared by the
applications. These systems are a hybrid between
distributed and centralized systems.
 Issues in FDBMS –
In a heterogeneous FDBMS, one server may be a
network DBMS, another an object DBMS and a
third a relational or hierarchical DBMS. In such
cases we may need a canonical system language,
together with language translators that
translate subqueries from the canonical
language to the language of each server. The
heterogeneity present in an FDBMS may
arise from several sources. The following
types of heterogeneity, or issues, occur in
an FDBMS.
 Differences in data model –
An organization may have databases built
on different data models, such as
relational, file and object data models,
and the modeling capabilities of these
models vary from one another. Hence,
dealing with them uniformly in a single
language is challenging. The difference
in data models is the basic issue in FDBMS.
 Difference in Constraints –
Constraint facilities and their implementation
vary from one system to another. Comparable
features must be reconciled in the construction
of the global schema, and the global schema also
has to deal with potential conflicts among
constraints. For example, a relationship
from the ER model is represented as a
referential integrity constraint in the
relational model.
 Difference in Query Language –
Even for the same data model, there are
many languages, and their versions also
vary. For example, SQL alone has several
versions, such as SQL-89, SQL-92 and
SQL-99, and each version has its own set
of data types, comparison operators,
string manipulation features and so on.
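The canonical-language idea mentioned earlier for a heterogeneous FDBMS can be sketched as follows. This is only an illustration under stated assumptions: the `CanonicalQuery` shape, the two server names and both target formats are invented here and do not correspond to any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CanonicalQuery:
    """Hypothetical canonical form: pick columns from one source, equality filter."""
    source: str
    columns: tuple
    filter_column: str
    filter_value: int


def to_relational(q):
    # Translator for a relational server: a parameterised SQL string plus arguments.
    cols = ", ".join(q.columns)
    return f"SELECT {cols} FROM {q.source} WHERE {q.filter_column} = ?", (q.filter_value,)


def to_object_store(q):
    # Translator for an imagined object DBMS that accepts a dict-shaped request.
    return {"class": q.source, "fields": list(q.columns),
            "where": {q.filter_column: q.filter_value}}


# Which translator each (hypothetical) server needs.
TRANSLATORS = {"hr_server": to_relational, "design_server": to_object_store}


def translate(server, query):
    """Turn one canonical subquery into the language of the target server."""
    return TRANSLATORS[server](query)


query = CanonicalQuery("employee", ("name", "salary"), "dept_id", 7)
print(translate("hr_server", query))
print(translate("design_server", query))
```

Adding another kind of server then means registering one more translator, while applications keep issuing queries in the canonical form.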
 Distributed DBMS Architectures
 DDBMS architectures are generally developed
depending on three parameters −
 Distribution − It states the physical distribution
of data across the different sites.
 Autonomy − It indicates the distribution of control
of the database system and the degree to which
each constituent DBMS can operate independently.
 Heterogeneity − It refers to the uniformity or
dissimilarity of the data models, system
components and databases.
 Architectural Models
 Some of the common architectural
models are
 Client - Server Architecture for DDBMS
 Peer - to - Peer Architecture for DDBMS
 Multi - DBMS Architecture
 Client - Server Architecture for DDBMS
 This is a two-level architecture where the functionality
is divided into servers and clients. The server
functions primarily encompass data management,
query processing, optimization and transaction
management. Client functions mainly include the user
interface. However, clients also have some functions
like consistency checking and transaction management.
 The two different client - server architectures are −
 Single Server Multiple Client
 Multiple Server Multiple Client (shown in the following
diagram)
 Peer- to-Peer Architecture for DDBMS
 In these systems, each peer acts both as a client and a
server for providing database services. The peers share
their resources with other peers and co-ordinate their
activities.
 This architecture generally has four levels of schemas −
 Global Conceptual Schema − Depicts the global logical
view of data.
 Local Conceptual Schema − Depicts logical data
organization at each site.
 Local Internal Schema − Depicts physical data
organization at each site.
 External Schema − Depicts user view of data.
 Multi - DBMS Architectures
 This is an integrated database system formed by a
collection of two or more autonomous database
systems.
 Multi-DBMS can be expressed through six levels of
schemas −
 Multi-database View Level − Depicts multiple
user views comprising subsets of the integrated
distributed database.
 Multi-database Conceptual Level − Depicts the
integrated multi-database, which comprises global
logical multi-database structure definitions.
 Multi-database Internal Level − Depicts the data
distribution across different sites and multi-database
to local data mapping.
 Local database View Level − Depicts public view of
local data.
 Local database Conceptual Level − Depicts local
data organization at each site.
 Local database Internal Level − Depicts physical
data organization at each site.
 There are two design alternatives for multi-DBMS −
 Model with multi-database conceptual level.
 Model without multi-database conceptual level.
 Design Alternatives
 The distribution design alternatives for
the tables in a DDBMS are as follows −
 Non-replicated and non-fragmented
 Fully replicated
 Partially replicated
 Fragmented
 Mixed
 Non-replicated & Non-fragmented
 In this design alternative, different tables are
placed at different sites. Data is placed so that
it is close to the site where it is
used most. It is most suitable for database
systems where the percentage of queries
that need to join information in tables placed at
different sites is low. If an appropriate
distribution strategy is adopted, then this
design alternative helps to reduce the
communication cost during data processing.
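One simple distribution strategy of this kind is to store each table at the site that uses it most. The sketch below assumes made-up access counts and site names, and counts every access from another site as one remote access; real allocation would weigh more factors.

```python
# Hypothetical access counts: how often each site queries each table.
access_counts = {
    "orders":    {"delhi": 120, "mumbai": 900, "pune": 40},
    "customers": {"delhi": 700, "mumbai": 300, "pune": 50},
    "inventory": {"delhi": 10,  "mumbai": 60,  "pune": 800},
}


def place_tables(counts):
    """Store each whole table once, at the site that accesses it most."""
    return {table: max(by_site, key=by_site.get) for table, by_site in counts.items()}


def remote_accesses(counts, placement):
    """Accesses that must cross the network because the table lives elsewhere."""
    return sum(n for table, by_site in counts.items()
               for site, n in by_site.items() if site != placement[table])


placement = place_tables(access_counts)
print(placement)
print(remote_accesses(access_counts, placement))
```

Comparing `remote_accesses` for different placements gives a crude measure of the communication cost each choice would incur.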
 Fully Replicated
 In this design alternative, one copy of all
the database tables is stored at each site. Since each
site has its own copy of the entire database,
queries are very fast and require negligible
communication cost. On the other hand, the
massive redundancy in data incurs a huge cost
during update operations. Hence, this is
suitable for systems where a large number of
queries must be handled whereas the
number of database updates is low.
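A back-of-the-envelope cost model makes this trade-off concrete. The model below is an assumption for illustration only: it counts one message per remote read and one message per copy that an update must be propagated to, and it compares full replication against keeping a single copy at one site.

```python
def message_cost(n_sites, reads_per_site, updates_per_site, fully_replicated):
    """Toy cost model: messages needed for a workload spread evenly over all sites."""
    if fully_replicated:
        # Reads are local; each update is sent to the other n-1 copies.
        return n_sites * updates_per_site * (n_sites - 1)
    # Single copy at one site: that site works locally, the other n-1 go remote.
    return (n_sites - 1) * (reads_per_site + updates_per_site)


read_heavy = dict(n_sites=4, reads_per_site=1000, updates_per_site=10)
write_heavy = dict(n_sites=4, reads_per_site=10, updates_per_site=1000)

for name, load in [("read-heavy", read_heavy), ("write-heavy", write_heavy)]:
    single = message_cost(**load, fully_replicated=False)
    full = message_cost(**load, fully_replicated=True)
    print(name, "single copy:", single, "fully replicated:", full)
```

Under these assumptions the read-heavy workload needs far fewer messages with full replication, while the write-heavy workload needs far more.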
 Partially Replicated
 Copies of tables or portions of tables are
stored at different sites. The distribution of
the tables is done in accordance with the
frequency of access. This takes into
consideration the fact that the frequency of
accessing the tables varies considerably from
site to site. The number of copies of the
tables (or portions) depends on how
frequently the access queries execute and
on the sites that generate the access queries.
 Fragmented
 In this design, a table is divided into two or
more pieces referred to as fragments or
partitions, and each fragment can be stored
at different sites. This considers the fact that
it seldom happens that all data stored in a
table is required at a given site. Moreover,
fragmentation increases parallelism and
provides better disaster recovery. Here, there
is only one copy of each fragment in the
system, i.e. no redundant data.
 The three fragmentation techniques are −
 Vertical fragmentation
 Horizontal fragmentation
 Hybrid fragmentation
 Mixed Distribution
 This is a combination of fragmentation
and partial replication. Here, the tables
are initially fragmented in any form
(horizontal or vertical), and then these
fragments are partially replicated across
the different sites according to the
frequency of accessing the fragments.
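The second step, replicating fragments according to access frequency, might look like the sketch below. The threshold rule (`min_share`), the fragment names and the access counts are all assumptions chosen for illustration; the slides do not specify how the copies are chosen.

```python
# Hypothetical fragments of one table and how often each site accesses them.
fragment_access = {
    "customers_north": {"delhi": 800, "mumbai": 50,  "pune": 150},
    "customers_west":  {"delhi": 100, "mumbai": 500, "pune": 400},
}


def allocate(access, min_share=0.3):
    """Place a copy of a fragment at every site issuing at least min_share of its accesses."""
    allocation = {}
    for fragment, by_site in access.items():
        total = sum(by_site.values())
        sites = [s for s, n in by_site.items() if n / total >= min_share]
        # Guarantee at least one copy: fall back to the busiest site.
        allocation[fragment] = sites or [max(by_site, key=by_site.get)]
    return allocation


print(allocate(fragment_access))
```

Raising `min_share` moves the design towards plain fragmentation with one copy per fragment, and lowering it moves the design towards full replication.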
