2.Database System Concepts and Architecture

Uploaded by

Vishu Shah
Database System Concepts and Architecture
Data Models
A data model is a collection of concepts that can be used to describe the
structure of a database.

Categories of data models:
1. Conceptual data models (or High level data models)
2. Representational or implementation data models
3. Physical data models (or low level data models)
1. Conceptual data models
• It is a high-level data model.
• It describes data the way users perceive it.
• Conceptual data models use concepts such as entities, attributes, and
relationships.
• An entity represents a real-world object or concept, such as an employee
or a project from the miniworld that is described in the database.
• An attribute represents some property of interest that further describes an
entity, such as the employee's name or salary.
• A relationship among two or more entities represents an association among
the entities.
• The Entity-Relationship model is a popular high-level conceptual data model.
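As a rough illustration of these concepts, the sketch below models two entity types and one relationship type as Python dataclasses. The class names, the attribute values, and the `hours_per_week` attribute are assumptions made for the example; they do not come from any particular database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:            # entity type
    name: str              # attribute
    salary: float          # attribute


@dataclass(frozen=True)
class Project:             # entity type
    title: str             # attribute


@dataclass(frozen=True)
class WorksOn:             # relationship type associating two entities
    employee: Employee
    project: Project
    hours_per_week: float  # hypothetical attribute of the relationship


# Hypothetical entities and one relationship instance between them.
rima = Employee("Rima", 52000.0)
payroll = Project("Payroll Upgrade")
assignment = WorksOn(rima, payroll, 12.5)

print(assignment.employee.name, "works on", assignment.project.title)
```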
2. Representational or implementation data models
• These models are called middle-level data models.
• These models are used most frequently in traditional commercial DBMSs.
• They include the widely used relational data model, as well as network
and hierarchical models.

3. Physical data models
• These are the low-level data models.
• They describe how data is stored as files in the computer by representing
information such as record formats, record orderings, and access paths.
• An access path is a structure that makes the search for particular database
records efficient.
• An index is an example of an access path that allows direct access to data
using an index term or a keyword.
Schemas, Instances, and Database State
In any data model, it is important to distinguish between the description of the
database and the database itself. The description of a database is called
the database schema, which is specified during database design and is not
expected to change frequently.

Figure shows a schema diagram for the database:


The actual data in a database may change quite frequently. For example, the
database shown in the figure changes every time we add a new student or enter
a new grade. The data in the database at a particular moment in time is called
a database state or snapshot. It is also called the current set of occurrences
or instances in the database.
For example, the table below shows three instances (rows) of the table
"Student" in the database.

Name   Student_Number   Class   Major
Rima   101              A       BCA
Tina   102              A       BBA
Sima   103              B       BCA

• The data in the database at a particular moment in time is called
a database state or snapshot.
• When we define a new database, we specify its database schema only to
the DBMS. At this point, the corresponding database state is the empty
state with no data.
• We get the initial state of the database when the database is
first populated or loaded with the initial data.
• From then on, every time an update operation is applied to the database,
we get another database state. At any point in time, the database has
a current state.
• The schema is sometimes called the intension, and a database state is
called an extension of the schema.
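The difference between schema and state can be sketched with Python's built-in `sqlite3` module. SQLite and the lowercase column names are assumptions made for the example; the rows are the Student rows shown earlier, and the update applied at the end is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Defining the schema (the intension). No data exists yet.
conn.execute(
    "CREATE TABLE student ("
    " name TEXT,"
    " student_number INTEGER PRIMARY KEY,"
    " class TEXT,"
    " major TEXT)"
)


def state(connection):
    """Return the current database state (the extension) as a list of rows."""
    return connection.execute(
        "SELECT name, student_number, class, major"
        " FROM student ORDER BY student_number"
    ).fetchall()


empty_state = state(conn)        # schema defined, database not yet populated

conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [("Rima", 101, "A", "BCA"),
     ("Tina", 102, "A", "BBA"),
     ("Sima", 103, "B", "BCA")],
)
initial_state = state(conn)      # state after the initial load

# Every update produces another state; the schema stays the same.
conn.execute("UPDATE student SET class = 'B' WHERE student_number = 102")
current_state = state(conn)

print(len(empty_state), len(initial_state), current_state[1])
```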
Three-Schema Architecture
The goal of the three-schema architecture is to separate the user applications
from the physical database. The following figure shows the three-schema
architecture.
• In this architecture, schemas can be defined at three levels.
• The internal level has an internal schema, which describes the physical
storage structure of the database. The internal schema uses a physical data
model and describes the complete details of data storage and access paths
for the database.
• The conceptual level has a conceptual schema, which describes the
structure of the whole database for a community of users. The conceptual
schema hides the details of physical storage structures and concentrates on
describing entities, data types, relationships, user operations, and
constraints. Usually, a representational data model is used to describe the
conceptual schema when a database system is implemented.
This implementation conceptual schema is often based on a conceptual
schema design in a high-level data model.
• The external or view level includes a number of external
schemas or user views. Each external schema describes the part of the
database that a particular user group is interested in and hides the rest of the
database from that user group. As in the previous level, each external schema
is typically implemented using a representational data model, possibly based
on an external schema design in a high-level data model.
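One way to picture the three levels is the small SQLite sketch below, written with Python's `sqlite3` module. The table, index, and view names are hypothetical, and mapping an index to the internal level and views to the external level is a simplification; a real DBMS stores far more internal detail than one index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Conceptual level: the structure of the whole database.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY, name TEXT, dept TEXT, salary REAL)"
)

# Internal level: a storage and access-path decision.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

# External level: one view per user group, each hiding the rest.
conn.execute("CREATE VIEW directory_view AS SELECT name, dept FROM employee")
conn.execute("CREATE VIEW payroll_view AS SELECT emp_id, salary FROM employee")

conn.execute("INSERT INTO employee VALUES (1, 'Rima', 'Sales', 52000.0)")

directory_rows = conn.execute("SELECT * FROM directory_view").fetchall()
payroll_rows = conn.execute("SELECT * FROM payroll_view").fetchall()

print(directory_rows)  # the directory group never sees salaries
print(payroll_rows)    # the payroll group never sees names or departments
```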
Data Independence
The three-schema architecture can be used to further explain the concept
of data independence, which can be defined as the capacity to change the
schema at one level of a database system without having to change the
schema at the next higher level.
We can define two types of data independence:
1. Logical data independence
• Logical data independence is the capacity to change the conceptual
schema without having to change external schemas or application programs.
• We may change the conceptual schema to expand the database (by adding a record
type or data item), to change constraints, or to reduce the database (by removing a
record type or data item).
• In the last case, external schemas that refer only to the remaining data should not be
affected.
2. Physical data independence

• Physical data independence is the capacity to change the internal
schema without having to change the conceptual schema. Hence, the
external schemas need not be changed as well.
• Changes to the internal schema may be needed because some physical files
were reorganized (for example, by creating additional access structures)
to improve the performance of retrieval or update.
• If the same data as before remains in the database, we should not have to
change the conceptual schema.
• Generally, physical data independence exists in most databases and file
environments where physical details such as the exact location of data on
disk, and hardware details of storage encoding, placement, compression,
splitting, merging of records, and so on are hidden from the user.
Note: Logical data independence, on the other hand, is harder to achieve
because it allows structural and constraint changes without affecting
application programs, which is a much stricter requirement.
• The three-schema architecture can make it easier to achieve true data
independence, both physical and logical.
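Both kinds of independence can be demonstrated in a small way with SQLite through Python's `sqlite3` module. This is a sketch under assumptions: the view `bca_students` stands in for an external schema, `application_query` stands in for an application program, and the `email` column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(101, "Rima", "BCA"), (102, "Tina", "BBA"), (103, "Sima", "BCA")],
)

# External schema used by the application.
conn.execute(
    "CREATE VIEW bca_students AS"
    " SELECT student_number, name FROM student WHERE major = 'BCA'"
)


def application_query(connection):
    """The application program: it is never edited in this example."""
    return connection.execute(
        "SELECT name FROM bca_students ORDER BY student_number"
    ).fetchall()


before = application_query(conn)

# Physical change: add an access structure. The conceptual schema is untouched.
conn.execute("CREATE INDEX idx_student_major ON student (major)")
after_physical_change = application_query(conn)

# Logical change: expand the conceptual schema with a new data item.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_logical_change = application_query(conn)

print(before == after_physical_change == after_logical_change)
```

The application keeps returning the same rows because it refers only to data that still exists after each change.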
Advantages of Data Independence
• Flexibility
• Application Compatibility
• Easier Maintenance
• Enhanced Security
• Data Continuity
• Scalability
• Reduced Development Time
• Ease of Integration
• Data Integrity
• Adaptation to Technology Changes
• Reduced Risk
DBMS Languages
DDL

• Once the design of a database is completed and a DBMS is chosen to
implement the database, the first step is to specify conceptual and
internal schemas for the database and any mappings between the two.
• In many DBMSs where no strict separation of levels is maintained, one
language, called the data definition language (DDL), is used by the
DBA and by database designers to define both schemas.
• The DBMS will have a DDL compiler whose function is to process
DDL statements in order to identify descriptions of the schema
constructs and to store the schema description in the DBMS catalog.
SDL

In DBMSs where a clear separation is maintained between the conceptual
and internal levels, the DDL is used to specify the conceptual schema only.
Another language, the storage definition language (SDL), is used to
specify the internal schema. In most relational DBMSs, the internal schema
is instead specified by a combination of functions, parameters, and
specifications related to storage. These permit the DBA staff to control
indexing choices and mapping of data to storage.

VDL

For a true three-schema architecture, we would need a third language, the view
definition language (VDL), to specify user views and their mappings to
the conceptual schema.
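In SQL-based systems a single language usually plays the DDL, storage, and view-definition roles. The sketch below assumes SQLite through Python's `sqlite3` module and uses hypothetical object names. It issues three definition statements and then reads SQLite's catalog table, `sqlite_master`, to show that the descriptions were stored there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# DDL role: define part of the conceptual schema.
conn.execute(
    "CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
)
# Storage-related choice (the role an SDL would play): an index.
conn.execute("CREATE INDEX idx_course_title ON course (title)")
# View definition (the role a VDL would play): a user view.
conn.execute("CREATE VIEW course_titles AS SELECT title FROM course")

# The schema descriptions now live in the catalog.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

for object_type, object_name in catalog:
    print(object_type, object_name)
```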
DML

Once the database schemas are compiled and the database is populated with
data, users must have some means to manipulate the database. Typical
manipulations include retrieval, insertion, deletion, and modification of the
data. The DBMS provides a set of operations or a language called the data
manipulation language (DML) for these purposes.

There are two main types of DMLs:
1. A high-level or nonprocedural DML
2. A low-level or procedural DML
High-level or nonprocedural

• High-level or nonprocedural DML can be used on its own to specify
complex database operations concisely.
• Many DBMSs allow high-level DML statements either to be entered
interactively from a display monitor or terminal or to be embedded in a
general-purpose programming language.
• In the latter case, DML statements must be identified within the program so
that they can be extracted by a precompiler and processed by the DBMS.
• High-level DMLs, such as SQL, can specify and retrieve many records in a
single DML statement; therefore, they are called set-at-a-time or
set-oriented DMLs.
• A query in a high-level DML often specifies which data to retrieve rather
than how to retrieve it; therefore, such languages are also
called declarative.
Low-level or procedural

• A low-level or procedural DML must be embedded in a general-purpose
programming language. This type of DML typically retrieves individual
records or objects from the database and processes each separately.
• Therefore, it needs to use programming language constructs, such as
looping, to retrieve and process each record from a set of records.
• Low-level DMLs are also called record-at-a-time DMLs because of this
property.
• Whenever DML commands, whether high level or low level, are embedded in a
general-purpose programming language, that language is called the host
language and the DML is called the data sublanguage.
• A high-level DML used in a standalone interactive manner is called
a query language.
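The two styles can be contrasted in a short sketch that uses Python as the host language and SQL (through the built-in `sqlite3` module) as the data sublanguage. The table, the salary figures, and the raise rule are hypothetical. The second function only imitates a record-at-a-time DML, since SQLite itself offers a set-oriented language.

```python
import sqlite3


def make_db():
    """Build a small hypothetical employee table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE employee ("
        " emp_id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
    )
    conn.executemany(
        "INSERT INTO employee VALUES (?, ?, ?)",
        [(1, "Rima", 1000), (2, "Tina", 2000), (3, "Sima", 3000)],
    )
    return conn


def raise_set_at_a_time(conn):
    # Declarative: one statement says WHICH rows change, not how to find them.
    conn.execute("UPDATE employee SET salary = salary + 100 WHERE salary < 2500")


def raise_record_at_a_time(conn):
    # Procedural: the host language loops and handles each record separately.
    rows = conn.execute("SELECT emp_id, salary FROM employee").fetchall()
    for emp_id, salary in rows:
        if salary < 2500:
            conn.execute(
                "UPDATE employee SET salary = ? WHERE emp_id = ?",
                (salary + 100, emp_id),
            )


def salaries(conn):
    cursor = conn.execute("SELECT salary FROM employee ORDER BY emp_id")
    return [row[0] for row in cursor]


db_a, db_b = make_db(), make_db()
raise_set_at_a_time(db_a)
raise_record_at_a_time(db_b)
print(salaries(db_a) == salaries(db_b))
```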
DBMS Interfaces
Menu-Based Interfaces
• Menu-based interfaces are used for Web clients or browsing. These interfaces
present the user with lists of options (called menus) that lead the user through
the formulation of a request.
• Menus do away with the need to memorize the specific commands and
syntax of a query language; rather, the query is composed step-by-step by
picking options from a menu that is displayed by the system.
• Pull-down menus are a very popular technique in Web-based user
interfaces.
• They are also often used in browsing interfaces, which allow a user to
look through the contents of a database in an exploratory and unstructured
manner.
Forms-Based Interfaces
• A forms-based interface displays a form to each user.
• Users can fill out all of the form entries to insert new data, or they can fill
out only certain entries, in which case the DBMS will retrieve matching
data for the remaining entries.
• Forms are usually designed and programmed for naive users as interfaces
to canned transactions.
• Many DBMSs have forms specification languages, which are special
languages that help programmers specify such forms. SQL*Forms is a
form-based language that specifies queries using a form designed in
conjunction with the relational database schema.
• Oracle Forms is a component of the Oracle product suite that provides an
extensive set of features to design and build applications using forms.
• Some systems have utilities that define a form by letting the end user
interactively construct a sample form on the screen.
Graphical User Interfaces
• A GUI typically displays a schema to the user in diagrammatic form. The
user then can specify a query by manipulating the diagram.
• In many cases, GUIs utilize both menus and forms.
• Most GUIs use a pointing device, such as a mouse, to select certain parts
of the displayed schema diagram.
Natural Language Interfaces
• These interfaces accept requests written in English or some other
language and attempt to understand them.
• A natural language interface usually has its own schema, which is similar
to the database conceptual schema, as well as a dictionary of important
words.
• The natural language interface refers to the words in its schema, as well as
to the set of standard words in its dictionary, to interpret the request.
• If the interpretation is successful, the interface generates a high-level
query corresponding to the natural language request and submits it to the
DBMS for processing; otherwise, a dialogue is started with the user to
clarify the request.
Speech Input and Output

• Limited use of speech as an input query and speech as an answer to a
question or result of a request is becoming commonplace.
• Applications with limited vocabularies such as inquiries for telephone
directory, flight arrival/departure, and credit card account information are
allowing speech for input and output to enable customers to access this
information.
• The speech input is detected using a library of predefined words and used
to set up the parameters that are supplied to the queries.
• For output, a similar conversion from text or numbers into speech takes
place.
Interfaces for Parametric Users
• Parametric users, such as bank tellers, often have a small set of
operations that they must perform repeatedly.
• For example, a teller is able to use single function keys to invoke
routine and repetitive transactions such as account deposits or
withdrawals, or balance inquiries.
• Systems analysts and programmers design and implement a special
interface for each known class of naive users.
• Usually a small set of abbreviated commands is included, with the
goal of minimizing the number of keystrokes required for each
request.
• For example, function keys in a terminal can be programmed to
initiate various commands.
• This allows the parametric user to proceed with a minimal number of
keystrokes.
Interfaces for the DBA

• Most database systems contain privileged commands that can be used
only by the DBA staff.
• These include commands for creating accounts, setting system
parameters, granting account authorization, changing a schema, and
reorganizing the storage structures of a database.
The Database System Environment

A DBMS is a complex software system. The database system environment
consists of the DBMS components and the software that interacts with them:

1. DBMS Component Modules
2. Database System Utilities
3. Tools, Application Environments, and Communications Facilities
1. DBMS Component Modules
• The figure is divided into two parts. The top part of the figure refers to the
various users of the database environment and their interfaces.
• The lower part shows the internals of the DBMS responsible for storage of
data and processing of transactions.
• The database and the DBMS catalog are usually stored on disk. Access to
the disk is controlled primarily by the operating system (OS).
• The figure shows interfaces for the DBA staff, casual users who work with
interactive interfaces to formulate queries (interactive queries),
application programmers who create programs using host
programming languages, and parametric users who do data entry work
by supplying parameters to predefined transactions.
• The DBA staff works on defining the database and tuning it by making
changes to its definition using the DDL and other privileged commands.
• The DDL compiler processes schema definitions, specified in the DDL,
and stores descriptions of the schemas (meta-data) in the DBMS catalog.
The catalog includes information such as the names and sizes of files,
names and data types of data items, storage details of each file, mapping
information among schemas, and constraints.

• Queries are parsed and validated for correctness of the query syntax, the
names of files and data elements, and so on by a query compiler that
compiles them into an internal form.
• This internal query is subjected to query optimization, which is concerned
with the rearrangement and possible reordering of operations, elimination
of redundancies, and use of correct algorithms and indexes during
execution.
• Application programmers write programs in host languages such as Java,
C, or C++ that are submitted to a precompiler. The precompiler extracts
DML commands from an application program written in a host
programming language.

• These commands are sent to the DML compiler for compilation into object
code for database access. The rest of the program is sent to the host
language compiler.

• The lower part of the figure shows the runtime database processor, which
executes (1) the privileged commands, (2) the executable query
plans, and (3) the canned transactions with runtime parameters.
• It works with the system catalog and may update it with statistics.
• It also works with the stored data manager, which in turn uses basic
operating system services for carrying out low-level input/output
(read/write) operations between the disk and main memory.
• Concurrency control and backup and recovery systems are shown
separately as a module in this figure.
• They are integrated into the working of the runtime database processor
for purposes of transaction management.
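The query compiler and optimizer described above can be observed in a small way with SQLite, used here through Python's `sqlite3` module as an assumed stand-in for a full DBMS. The sketch shows a query being rejected during validation, and then uses SQLite's `EXPLAIN QUERY PLAN` statement to show the chosen plan changing once an index exists. The table and index names are hypothetical, and the exact wording of the plan text varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    " student_number INTEGER PRIMARY KEY, name TEXT, major TEXT)"
)

# Validation: a query naming an unknown data element is rejected up front.
try:
    conn.execute("SELECT no_such_column FROM student")
    validation_error = None
except sqlite3.OperationalError as exc:
    validation_error = str(exc)

QUERY = "SELECT name FROM student WHERE major = 'BCA'"


def plan(connection, sql):
    """Return the human-readable steps of the plan chosen for sql."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


plan_without_index = plan(conn, QUERY)   # expected to scan the whole table

conn.execute("CREATE INDEX idx_student_major ON student (major)")
plan_with_index = plan(conn, QUERY)      # expected to search via the index

print(validation_error)
print(plan_without_index)
print(plan_with_index)
```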
2. Database System Utilities
DBMSs have database utilities that help the DBA manage the database
system. Common utilities have the following types of functions:
1. Loading
• A loading utility is used to load existing data files—such as text files or
sequential files—into the database.
• Usually, the current (source) format of the data file and the desired (target)
database file structure are specified to the utility, which then automatically
reformats the data and stores it in the database.
• With the proliferation of DBMSs, transferring data from one DBMS to another
is becoming common in many organizations.
• Some vendors are offering products that generate the appropriate loading
programs, given the existing source and target database storage descriptions
(internal schemas). Such tools are also called conversion tools.
2. Backup

• A backup utility creates a backup copy of the database, usually by dumping
the entire database onto tape or other mass storage medium.
• The backup copy can be used to restore the database in case of catastrophic
disk failure. Incremental backups are also often used, where only changes
since the previous backup are recorded.
• Incremental backup is more complex, but saves storage space.

3. Database storage reorganization

• This utility can be used to reorganize a set of database files into different
file organizations, and create new access paths to improve performance.
4. Performance monitoring

• Such a utility monitors database usage and provides statistics to the DBA.
• The DBA uses the statistics in making decisions such as whether or not to reorganize
files or whether to add or drop indexes to improve performance.

Other utilities may be available for sorting files, handling data
compression, monitoring access by users, interfacing with the network, and
performing other functions.
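A loading utility and a backup utility can be imitated in a few lines, assuming SQLite through Python's `sqlite3` module (Python 3.7 or later for `Connection.backup`). The function names and the in-memory "file" are hypothetical. A production utility would also handle format errors, very large files, and incremental backups, none of which are shown here.

```python
import csv
import io
import sqlite3

# Stand-in for an existing text file in its source format.
SOURCE_FILE = io.StringIO(
    "name,student_number,class,major\n"
    "Rima,101,A,BCA\n"
    "Tina,102,A,BBA\n"
    "Sima,103,B,BCA\n"
)


def load_students(connection, text_file):
    """Loading: reformat each source record and store it in the target table."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS student ("
        " name TEXT, student_number INTEGER PRIMARY KEY,"
        " class TEXT, major TEXT)"
    )
    reader = csv.DictReader(text_file)
    rows = [
        (r["name"], int(r["student_number"]), r["class"], r["major"])
        for r in reader
    ]
    connection.executemany("INSERT INTO student VALUES (?, ?, ?, ?)", rows)
    connection.commit()
    return len(rows)


def backup_database(source, target):
    """Backup: dump the entire source database into the target connection."""
    source.backup(target)


primary = sqlite3.connect(":memory:")
loaded = load_students(primary, SOURCE_FILE)

backup_conn = sqlite3.connect(":memory:")
backup_database(primary, backup_conn)
restored_count = backup_conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]

print(loaded, restored_count)
```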
Centralized Architectures for DBMSs
• Architectures for DBMSs have followed trends similar to those for general
computer system architectures.
• Older architectures used mainframe computers to provide the main
processing for all system functions, including user application programs
and user interface programs, as well as all the DBMS functionality.
• The reason was that in older systems, most users accessed the DBMS via
computer terminals that did not have processing power and only provided
display capabilities.
• Therefore, all processing was performed remotely on the computer system
housing the DBMS, and only display information and controls were sent
from the computer to the display terminals, which were connected to the
central computer via various types of communications networks.
• As prices of hardware declined, most users replaced their terminals with PCs
and workstations, and more recently with mobile devices.
• At first, database systems used these computers similarly to how they had
used display terminals, so that the DBMS itself was still a centralized DBMS
in which all the DBMS functionality, application program execution, and
user interface processing were carried out on one machine.
• The following figure shows the physical components in a centralized
architecture.
• Gradually, DBMS systems started to exploit the available processing power
at the user side, which led to client/server DBMS architectures.
Basic Client/Server Architectures
• The client/server architecture was developed to deal with computing
environments in which a large number of PCs, workstations, file servers,
printers, database servers, Web servers, e-mail servers, and other software
and equipment are connected via a network.
• The idea is to define specialized servers with specific functionalities.
• For example, it is possible to connect a number of PCs or small
workstations as clients to a file server that maintains the files of the client
machines.
• Another machine can be designated as a printer server by being connected
to various printers; all print requests by the clients are forwarded to this
machine.
• Web servers or e-mail servers also fall into the specialized server category.
The resources provided by specialized servers can be accessed by many
client machines.
• The client machines provide the user with the appropriate interfaces to utilize
these servers, as well as with local processing power to run local
applications.
• This concept can be carried over to other software packages, with specialized
programs—such as a CAD (computer-aided design) package—being stored
on specific server machines and being made accessible to multiple clients.
• The concept of client/server architecture assumes an underlying
framework that consists of many PCs/workstations and mobile devices as
well as a smaller number of server machines, connected via wireless
networks or LANs and other types of computer networks.
• A client in this framework is typically a user machine that provides user
interface capabilities and local processing.
• A server is a system containing both hardware and software that can provide
services to the client machines, such as file access, printing, archiving, or
database access.
1) Two-Tier Client/Server Architecture for DBMS:
• Two-tier means that the architecture has two layers: the client layer and
the data layer.
• The client layer has several client machines that can access the
database server.
• The API on the client machine establishes the connection between that
machine and the database server through JDBC or a similar interface.
• This is needed because clients and the database server may be at different
locations.
• Once the connection is established, the application program behind the
user interface on the client machine issues a query.
• The query is processed by the database server, and the queried
information is sent back to the client machine.
Figures: logical two-tier client/server architecture; physical two-tier client/server architecture.
2) Three-Tier Client/Server Architecture for DBMS:
• Here an additional layer, called the business logic layer, acts as an
intermediary between the client layer and the data layer.
• The business logic layer is where the application programs are
processed.
• The client machines contact the application server, which processes
the application programs, fetches the required data from the
database, and sends this information back to the client machine in
a suitable format.
Why Is 3-Tier Architecture Better Than 2-Tier Architecture?
• Three-tier architecture is more scalable and more secure.
• Although two-tier architecture is easy to maintain, it does not scale
well to a large number of clients, and it is less secure because
the clients have direct access to the database server.
• Three-tier architecture improves the scalability and security of the data
because of the intermediate layer: it processes the application programs
and only retrieves data from the database server, so that this processing
does not have to take place on the database server itself.
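The difference between the two designs can be sketched in a single Python process. This is only an illustration under assumptions: there is no network, an in-memory SQLite database stands in for the database server, and `ApplicationServer`, its `withdraw` rule, and the account data are all hypothetical.

```python
import sqlite3


def make_database_server():
    """Data layer: stands in for a remote database server."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE account ("
        " acc_id INTEGER PRIMARY KEY, owner TEXT, balance REAL)"
    )
    db.execute("INSERT INTO account VALUES (1, 'Rima', 500.0)")
    db.commit()
    return db


def two_tier_client(db):
    """Two-tier: the client holds a connection and sends SQL directly."""
    return db.execute("SELECT balance FROM account WHERE acc_id = 1").fetchone()[0]


class ApplicationServer:
    """Three-tier: the business logic layer owns the database connection."""

    def __init__(self, db):
        self._db = db

    def withdraw(self, acc_id, amount):
        # Business rules are enforced here, before the data layer is touched.
        if amount <= 0:
            raise ValueError("amount must be positive")
        row = self._db.execute(
            "SELECT balance FROM account WHERE acc_id = ?", (acc_id,)
        ).fetchone()
        if row is None:
            raise LookupError("unknown account")
        if row[0] < amount:
            raise ValueError("insufficient funds")
        self._db.execute(
            "UPDATE account SET balance = balance - ? WHERE acc_id = ?",
            (amount, acc_id),
        )
        self._db.commit()
        return row[0] - amount


def three_tier_client(app_server):
    """Three-tier: the client only calls the application server."""
    return app_server.withdraw(1, 200.0)


database = make_database_server()
print(two_tier_client(database))
print(three_tier_client(ApplicationServer(database)))
```

In the three-tier version the client never sees SQL or the connection, which is the property the security argument above relies on.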
Classification of Database Management Systems
Database Management Systems (DBMS) can be classified based on
various criteria, including data models, number of users, database
distribution, and use case scenarios.
Based on Data Models

1.) Hierarchical DBMS:
• Organizes data in a tree-like structure.
• Each record has a single parent and may have multiple children.
• Example: IBM Information Management System (IMS).

2.) Network DBMS:
• Organizes data in a graph structure allowing many-to-many relationships.
• More flexible than hierarchical DBMS.
• Example: Integrated Data Store (IDS).

3.) Relational DBMS (RDBMS):
• Organizes data in tables (relations) with rows and columns.
• Uses Structured Query Language (SQL) for data manipulation.
• Examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server.

4.) Object-oriented DBMS (OODBMS):
• Stores data as objects, similar to object-oriented programming.
• Supports complex data types and inheritance.
• Examples: db4o, ObjectDB.

5.) Object-relational DBMS (ORDBMS):
• Combines features of RDBMS and OODBMS.
• Supports both relational tables and object-oriented features.
• Examples: PostgreSQL, Oracle.

6.) NoSQL DBMS:
• Designed for unstructured or semi-structured data.
• Examples include document stores (MongoDB), key-value stores (Redis),
column-family stores (Cassandra), and graph databases (Neo4j).
Based on Number of Users

1.) Single-user DBMS:
• Supports one user at a time.
• Typically used on personal computers.
• Example: Microsoft Access.

2.) Multi-user DBMS:
• Supports multiple users simultaneously.
• Manages concurrent access and maintains data integrity.
• Examples: MySQL, PostgreSQL, Oracle.
Based on Database Distribution

1.) Centralized DBMS:
• Data is stored and managed on a single central server.
• Users connect to the central server to access data.
• Example: Mainframe DBMS.

2.) Distributed DBMS (DDBMS):
• Data is distributed across multiple sites or servers.
• Ensures data consistency and coordination among sites.
• Examples: Google Spanner, Amazon Aurora.

3.) Federated DBMS:
• Integrates multiple autonomous databases into a single federated database.
• Each database maintains its autonomy.
• Example: Microsoft SQL Server with Linked Servers.
Based on Use Case Scenarios

1.) OLTP (Online Transaction Processing) DBMS:
• Optimized for transaction-oriented applications.
• Ensures ACID (Atomicity, Consistency, Isolation, Durability) properties.
• Examples: MySQL, PostgreSQL, Oracle.

2.) OLAP (Online Analytical Processing) DBMS:
• Optimized for read-heavy workloads and complex queries.
• Used in data warehousing and business intelligence applications.
• Examples: Amazon Redshift, Microsoft SQL Server Analysis Services
(SSAS).
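The atomicity and consistency parts of the ACID properties mentioned for OLTP systems can be illustrated with a small transfer transaction. The sketch assumes SQLite through Python's `sqlite3` module; the `account` table, its `CHECK` constraint, and the `transfer` function are hypothetical. Isolation and durability are not demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " acc_id INTEGER PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"   # consistency rule
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()


def transfer(connection, source, target, amount):
    """Move amount between accounts as one all-or-nothing transaction."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE acc_id = ?",
                (amount, target),
            )
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE acc_id = ?",
                (amount, source),
            )
        return True
    except sqlite3.IntegrityError:
        # The debit violated the CHECK constraint, so the credit was undone too.
        return False


def balances(connection):
    return dict(connection.execute("SELECT acc_id, balance FROM account"))


ok = transfer(conn, 1, 2, 30)        # succeeds
rejected = transfer(conn, 1, 2, 500) # fails and leaves no partial update
print(ok, rejected, balances(conn))
```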
