
Database_print

The document provides an overview of the Entity Relationship (ER) model in database management systems (DBMS), detailing its components, types of relationships, and the significance of ER diagrams. It also explains distributed database systems, distinguishing between homogeneous and heterogeneous types, and outlines various architectures such as client-server and peer-to-peer. Additionally, it discusses the advantages of distributed databases and storage systems, emphasizing improved performance and reliability.

Unit 1

What is the entity relationship model in DBMS?

An entity-relationship model (ER model) describes the structure of a database with the help
of a diagram, known as an Entity Relationship Diagram (ER diagram). An ER model is a design
or blueprint of a database that can later be implemented as a database.

What are the 3 main components of entity relationship modeling?

The three main components of the ER Model are entities, attributes and relationships.

What is the entity relationship model, with an example?

An ER model is used to represent real-world objects. An entity is a thing or object in the real
world that is distinguishable from its surrounding environment. For example, each employee of
an organization is a separate entity.

What are the types of entity relationship?

There are three types of relationships that can exist between two entities.
 One-to-One Relationship.
 One-to-Many or Many-to-One Relationship.
 Many-to-Many Relationship.

What are the 3 types of relationships in a database?

There are three types of relationships in a database:
 one-to-one,
 one-to-many, and
 many-to-many.
What is an ER diagram?

An Entity Relationship (ER) diagram is a type of flowchart that illustrates how
“entities” such as people, objects or concepts relate to each other within a system.

Why use ER Diagrams?


Here are the prime reasons for using ER diagrams:

 They help you define terms related to entity relationship modeling
 They provide a preview of how all your tables should connect and which fields each table will hold
 They help to describe entities, attributes and relationships
 ER diagrams are translatable into relational tables, which allows you to build databases quickly
 ER diagrams can be used by database designers as a blueprint for implementing data in specific software applications
 The database designer gains a better understanding of the information to be contained in the database with the help of the ER diagram
 The ER diagram allows you to communicate the logical structure of the database to users

ER Diagram Symbols & Notations

 Rectangles: represent entity types
 Ellipses: represent attributes
 Diamonds: represent relationship types
 Lines: link attributes to entity types and entity types to relationship types
 Underlined attributes: represent primary keys
 Double ellipses: represent multi-valued attributes

ER Diagram Symbols

Components of the ER Diagram


This model is based on three basic concepts:

 Entities
 Attributes
 Relationships

ENTITY
An entity can be a place, person, object, event or concept about which data is stored in the
database. Every entity must have at least one attribute and a unique key. Every entity is
made up of some ‘attributes’ which represent that entity.

Examples of entities:

 Person: Employee, Student, Patient


 Place: Store, Building
 Object: Machine, product, and Car
 Event: Sale, Registration, Renewal
 Concept: Account, Course

Entity set:

An entity set is a group of entities of a similar kind, for example Student. Its entities may
share attribute values. Entities are described by their properties, which are also called
attributes. Each attribute has its own value. For example, a student entity may have
name, age and class as attributes.

Relationship
A relationship is an association among two or more entities. E.g., Tom works in
the Chemistry department.

Entities take part in relationships. We can often identify relationships with verbs or verb
phrases.

For example:

 You are attending this lecture
 I am giving the lecture

Just like entities, we can classify relationships according to relationship types:

 A student attends a lecture
 A lecturer is giving a lecture

Weak Entities
A weak entity is a type of entity that does not have a key attribute of its own. It can be
identified uniquely only in combination with the primary key of another (owner) entity. For
that reason, a weak entity set must have total participation in its identifying relationship.
Strong Entity Set
A strong entity set always has a primary key.
It is represented by a rectangle symbol.
Its primary key is shown with an underline.
A member of a strong entity set is called a dominant entity.
The primary key is one of its attributes and identifies each member.
In the ER diagram, the relationship between two strong entity sets is shown with a diamond symbol.
The line connecting a strong entity set to a relationship is single.

Weak Entity Set

It does not have enough attributes to build a primary key.
It is represented by a double rectangle symbol.
It contains a partial key, which is shown with a dashed underline.
A member of a weak entity set is called a subordinate entity.
Its primary key is a combination of the primary key of the strong entity set and its own partial key.
The relationship between a strong and a weak entity set is shown with a double diamond symbol.
The line connecting the weak entity set to its identifying relationship is double.

Types of Attributes

Simple attribute − Cannot be divided any further; it is also called an atomic value. For example, a student’s contact number.
Composite attribute − Can be broken down into smaller parts. For example, a student’s full name may be further divided into first name, second name, and last name.
Derived attribute − Is not stored in the physical database; its value is derived from other attributes present in the database. For example, age should not be stored directly. Instead, it should be derived from the person’s date of birth (DOB).
Multivalued attribute − Can have more than one value. For example, a student can have more than one mobile number, email address, etc.
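
The four attribute types can be illustrated with a small Python sketch. The class and field names (`Student`, `FullName`, `emails`) are assumptions made for illustration only; they are not part of the notes.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class FullName:
    # Composite attribute: it can be broken into smaller parts.
    first: str
    last: str


@dataclass
class Student:
    contact_number: str        # simple (atomic) attribute
    name: FullName             # composite attribute
    date_of_birth: date        # stored attribute that age is derived from
    emails: list[str] = field(default_factory=list)  # multivalued attribute

    def age(self, today: date) -> int:
        # Derived attribute: computed from date_of_birth, never stored.
        had_birthday = (today.month, today.day) >= (
            self.date_of_birth.month, self.date_of_birth.day)
        return today.year - self.date_of_birth.year - (0 if had_birthday else 1)
```

Calling `age()` with the current date always gives an up-to-date value, which is the reason the notes advise against storing age directly.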

Relational data model

What is the relational database model?

The relational model (RM) for database management is an approach to managing data
using a structure and language consistent with first-order predicate logic.

Basic relationships:

 One-to-One Relationships
 One-to-Many Relationships
 Many-to-One Relationships
 Many-to-Many Relationships

1. One-to-one:

One entity from entity set X can be associated with at most one entity of entity
set Y and vice versa.

For example, one student is issued one identity card, and that card belongs to that one
student.

2. One-to-many:

One entity from entity set X can be associated with multiple entities of entity set Y, but
an entity from entity set Y can be associated with at most one entity of entity set X.

For example, one class consists of multiple students.

3. Many-to-one:

More than one entity from entity set X can be associated with
at most one entity of entity set Y. However, an entity from
entity set Y may or may not be associated with more than one
entity from entity set X.

For example, many students belong to the same class.

4. Many-to-many:

One entity from X can be associated with more than one entity from Y and vice versa.

For example, Students as a group are associated with multiple faculty members, and faculty
members can be associated with multiple students.
Mapping Process

ER diagrams mainly comprise −

 Entities and their attributes
 Relationships, which are associations among entities

Mapping Entity

An entity is a real-world object with some attributes.

Mapping Process (Algorithm)

 Create a table for each entity.
 Make the entity's attributes fields of the table, with their respective data types.
 Declare the primary key.

Mapping Relationship

A relationship is an association among entities.

Mapping Process
 Create a table for the relationship.
 Add the primary keys of all participating entities as fields of the table, with their respective data types.
 If the relationship has any attributes, add each attribute as a field of the table.
 Declare a primary key composed of the primary keys of all participating entities.
 Declare all foreign key constraints.
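
As a minimal sketch of the entity and relationship mapping steps, the following Python snippet uses the standard `sqlite3` module. The entities `student` and `course`, the relationship `enrolls`, and its `grade` attribute are hypothetical examples chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled

conn.executescript("""
    -- One table per entity, each with a declared primary key.
    CREATE TABLE student (student_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
    CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);

    -- One table for the relationship: the primary keys of both participants,
    -- the relationship's own attribute, a composite primary key, and foreign keys.
    CREATE TABLE enrolls (
        student_id INTEGER NOT NULL,
        course_id  INTEGER NOT NULL,
        grade      TEXT,
        PRIMARY KEY (student_id, course_id),
        FOREIGN KEY (student_id) REFERENCES student(student_id),
        FOREIGN KEY (course_id)  REFERENCES course(course_id)
    );

    INSERT INTO student VALUES (1, 'Tom');
    INSERT INTO course  VALUES (10, 'Chemistry');
    INSERT INTO enrolls VALUES (1, 10, 'A');
""")


def enroll(student_id, course_id, grade=None):
    """Return True if the row was accepted, False if a key constraint rejected it."""
    try:
        conn.execute("INSERT INTO enrolls VALUES (?, ?, ?)",
                     (student_id, course_id, grade))
        return True
    except sqlite3.IntegrityError:
        return False
```

With this schema, `enroll(1, 10)` is rejected by the composite primary key and `enroll(99, 10)` is rejected by the foreign key, because student 99 does not exist.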

Mapping Weak Entity Sets

A weak entity set is one that does not have a primary key of its own.
Mapping Process
 Create a table for the weak entity set.
 Add all its attributes to the table as fields.
 Add the primary key of the identifying entity set.
 Declare all foreign key constraints.
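
A minimal sketch of the weak entity mapping, again using Python's `sqlite3` module. The `employee` and `dependent` tables are hypothetical, and the `ON DELETE CASCADE` clause is an added assumption: it is one common way to express that a weak entity cannot exist without its owner, but the steps above do not require it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for the cascade to take effect

conn.executescript("""
    CREATE TABLE employee (emp_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    CREATE TABLE dependent (
        emp_id   INTEGER NOT NULL,  -- primary key of the identifying entity set
        dep_name TEXT    NOT NULL,  -- partial key of the weak entity set
        relation TEXT,
        PRIMARY KEY (emp_id, dep_name),
        FOREIGN KEY (emp_id) REFERENCES employee(emp_id) ON DELETE CASCADE
    );

    INSERT INTO employee  VALUES (1, 'Tom');
    INSERT INTO dependent VALUES (1, 'Ann', 'daughter');
""")

# Deleting the owner also removes its dependents.
conn.execute("DELETE FROM employee WHERE emp_id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM dependent").fetchone()[0]
```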

Mapping Hierarchical Entities

ER specialization or generalization comes in the form of hierarchical entity sets.

Mapping Process
 Create tables for all higher-level entities.
 Create tables for lower-level entities.
 Add the primary keys of higher-level entities to the tables of lower-level entities.
 In lower-level tables, add all other attributes of lower-level entities.
 Declare the primary key of the higher-level table and the primary key of the lower-level table.
 Declare foreign key constraints.
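
The hierarchical mapping can be sketched the same way. Here `person` is an assumed higher-level entity and `student` an assumed lower-level entity; the lower-level table reuses the higher-level primary key as both its primary key and a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- Higher-level entity.
    CREATE TABLE person (person_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

    -- Lower-level entity: the inherited key plus its own attributes.
    CREATE TABLE student (
        person_id INTEGER PRIMARY KEY,
        roll_no   TEXT NOT NULL,
        FOREIGN KEY (person_id) REFERENCES person(person_id)
    );

    INSERT INTO person  VALUES (1, 'Asha');
    INSERT INTO student VALUES (1, 'R-101');
""")

# A join on the shared key reassembles the full view of a student.
row = conn.execute("""
    SELECT p.name, s.roll_no
    FROM student AS s JOIN person AS p ON p.person_id = s.person_id
""").fetchone()
```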

Relational Algebra
Relational algebra is a procedural query language that takes relations as input and
generates a relation as output. It mainly provides the theoretical foundation for
relational databases and SQL.
Operators in Relational Algebra
Projection (π)
Projection returns only the required columns of a relation.
Example:
R
(A B C)
----------
1 2 4
2 2 3
3 2 3
4 3 4
π B,C (R)
B C
-----
2 4
2 3
3 4
Note: By default, projection removes duplicate rows.

Selection (σ)
Selection returns the tuples of a relation that satisfy a condition.
For the relation R above,
σ (C>3) (R)
selects the tuples in which C is greater than 3.
σ (C>3) (R) returns the following tuples.

A B C
-------
1 2 4
4 3 4
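
As a rough sketch, projection and selection over the relation R above can be written in Python with sets of tuples. The helper names `project` and `select` are assumptions made for illustration; using a set makes duplicate removal automatic.

```python
ATTRS = ("A", "B", "C")
R = {(1, 2, 4), (2, 2, 3), (3, 2, 3), (4, 3, 4)}


def project(relation, attrs, wanted):
    """Keep only the wanted columns; the set drops duplicate rows."""
    idx = [attrs.index(a) for a in wanted]
    return {tuple(row[i] for i in idx) for row in relation}


def select(relation, attrs, predicate):
    """Keep only the rows for which predicate(row as dict) is true."""
    return {row for row in relation if predicate(dict(zip(attrs, row)))}


bc = project(R, ATTRS, ("B", "C"))               # {(2, 4), (2, 3), (3, 4)}
c_gt_3 = select(R, ATTRS, lambda t: t["C"] > 3)  # {(1, 2, 4), (4, 3, 4)}
```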
Union (U)
The union operation in relational algebra is the same as the union operation in set theory,
with the constraint that both relations must have the same set of attributes.

Set Difference (-)


Set difference in relational algebra is the same as the set difference operation in set theory,
with the constraint that both relations must have the same set of attributes.
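
A minimal Python sketch of union and set difference, including the same-attributes check. The relations `r` and `s` and the helper names are hypothetical and used only for illustration.

```python
def check_compatible(attrs_r, attrs_s):
    # Both relations must have the same attributes.
    if tuple(attrs_r) != tuple(attrs_s):
        raise ValueError("relations must have the same attributes")


def union(attrs_r, r, attrs_s, s):
    check_compatible(attrs_r, attrs_s)
    return r | s


def difference(attrs_r, r, attrs_s, s):
    check_compatible(attrs_r, attrs_s)
    return r - s


attrs = ("id", "name")
r = {(1, "a"), (2, "b")}
s = {(2, "b"), (3, "c")}

r_union_s = union(attrs, r, attrs, s)       # {(1, 'a'), (2, 'b'), (3, 'c')}
r_minus_s = difference(attrs, r, attrs, s)  # {(1, 'a')}
```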

Rename (ρ)
Rename is a unary operation used for renaming attributes of a relation.
ρ (a/b) R renames the attribute ‘b’ of relation R to ‘a’.

Cross Product (X)


The cross product of two relations A and B, written A X B, contains all the attributes of A
followed by all the attributes of B. Each record of A is paired with every record of B.
Note: if A has ‘n’ tuples and B has ‘m’ tuples, then A X B has ‘n*m’ tuples.
Natural Join (⋈)
Natural join is a binary operator. A natural join of two relations results in the set of all
combinations of tuples that have equal values on their common attributes.
Conditional Join
Conditional join works similarly to natural join. In a natural join, the default condition is
equality on the common attributes, while in a conditional join we can specify any condition,
such as greater than, less than, or not equal.
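
The three operators can be sketched in Python by treating each tuple as a dictionary from attribute name to value. The function names and the sample relations are assumptions for illustration, and `cross_product` assumes the two relations share no attribute names.

```python
from itertools import product


def cross_product(r, s):
    # Pair every tuple of r with every tuple of s (no shared attribute names).
    return [{**t, **u} for t, u in product(r, s)]


def natural_join(r, s):
    # Keep the pairs that agree on every common attribute.
    result = []
    for t, u in product(r, s):
        common = t.keys() & u.keys()
        if all(t[k] == u[k] for k in common):
            result.append({**t, **u})
    return result


def conditional_join(r, s, condition):
    # Keep the pairs that satisfy an arbitrary condition.
    return [{**t, **u} for t, u in product(r, s) if condition(t, u)]


a = [{"x": 1}, {"x": 2}, {"x": 3}]
b = [{"y": 2}, {"y": 5}]
employee = [{"name": "Tom", "did": 1}, {"name": "Ann", "did": 2}]
department = [{"did": 1, "dname": "Chemistry"}, {"did": 3, "dname": "Physics"}]

pairs = cross_product(a, b)                                   # 3 * 2 = 6 tuples
works_in = natural_join(employee, department)                 # only Tom matches
greater = conditional_join(a, b, lambda t, u: t["x"] > u["y"])  # only x=3, y=2
```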
Unit 2

A distributed database system allows applications to access data from local and remote
databases. In Oracle’s terminology, a homogeneous distributed database system is one in which
each database is an Oracle Database, and a heterogeneous distributed database system is one in
which at least one of the databases is not an Oracle Database. Distributed databases use a
client/server architecture to process information requests.

This section contains the following topics:

 Homogeneous Distributed Database Systems

 Heterogeneous Distributed Database Systems

 Client/Server Database Architecture

Types of Distributed Databases

Distributed databases can be broadly classified into homogeneous and heterogeneous
distributed database environments, each with further sub-divisions, as shown in the following
illustration.

Homogeneous Distributed Databases


In a homogeneous distributed database, all the sites use identical DBMS and operating
systems. Its properties are −
 The sites use very similar software.
 The sites use identical DBMS or DBMS from the same vendor.
 Each site is aware of all other sites and cooperates with other sites to process user requests.
 The database is accessed through a single interface as if it is a single database.
Types of Homogeneous Distributed Database
There are two types of homogeneous distributed database −
 Autonomous − Each database is independent and functions on its own. The databases are integrated by a controlling application and use message passing to share data updates.
 Non-autonomous − Data is distributed across the homogeneous nodes and a central or master DBMS co-ordinates data updates across the sites.

Heterogeneous Distributed Databases


In a heterogeneous distributed database, different sites have different operating systems,
DBMS products and data models. Its properties are −
 Different sites use dissimilar schemas and software.
 The system may be composed of a variety of DBMSs like relational, network,
hierarchical or object oriented.
 Query processing is complex due to dissimilar schemas.
 Transaction processing is complex due to dissimilar software.
 A site may not be aware of other sites and so there is limited co-operation in
processing user requests.
Types of Heterogeneous Distributed Databases
 Federated − The heterogeneous database systems are independent in nature
and integrated together so that they function as a single database system.
 Un-federated − The database systems employ a central coordinating module
through which the databases are accessed.

Distributed DBMS Architectures

DDBMS architectures are generally developed depending on three parameters −


 Distribution − It states the physical distribution of data across the different
sites.
 Autonomy − It indicates the distribution of control of the database system and
the degree to which each constituent DBMS can operate independently.
 Heterogeneity − It refers to the uniformity or dissimilarity of the data models,
system components and databases.

Architectural Models

Some of the common architectural models are −

 Client - Server Architecture for DDBMS


 Peer - to - Peer Architecture for DDBMS
 Multi - DBMS Architecture
Client - Server Architecture for DDBMS
This is a two-level architecture where the functionality is divided between servers and clients.
The server functions primarily encompass data management, query processing, optimization and
transaction management. Client functions mainly include the user interface. However, clients
also have some functions such as consistency checking and transaction management.
The two different client - server architectures are −

 Single Server Multiple Client


 Multiple Server Multiple Client (shown in the following diagram)

Peer- to-Peer Architecture for DDBMS


In these systems, each peer acts both as a client and a server for imparting database services.
The peers share their resource with other peers and co-ordinate their activities.
This architecture generally has four levels of schemas −
 Global Conceptual Schema − Depicts the global logical view of data.
 Local Conceptual Schema − Depicts logical data organization at each site.
 Local Internal Schema − Depicts physical data organization at each site.
 External Schema − Depicts user view of data.
Multi - DBMS Architectures
This is an integrated database system formed by a collection of two or more autonomous
database systems.
Multi-DBMS can be expressed through six levels of schemas −
 Multi-database View Level − Depicts multiple user views comprising subsets of the integrated distributed database.
 Multi-database Conceptual Level − Depicts the integrated multi-database, comprising global logical multi-database structure definitions.
 Multi-database Internal Level − Depicts the data distribution across different sites and multi-database to local data mapping.
 Local database View Level − Depicts public view of local data.
 Local database Conceptual Level − Depicts local data organization at each site.
 Local database Internal Level − Depicts physical data organization at each site.
There are two design alternatives for multi-DBMS −

 Model with multi-database conceptual level.


 Model without multi-database conceptual level.
Distributed Data Storage
A distributed database is a database that consists of two or more files located in different
sites either on the same network or on entirely different networks.

What are the advantages of distributed database?


Advantages of a distributed database:
 Management of data with different levels of transparency
 Increased reliability and availability
 Easier expansion
 Improved performance

What are the advantages of using a distributed storage system?


Compared to traditional NAS and DAS storage, distributed storage has some natural
advantages, such as:
 High performance
 Support for tiered storage
 Multiple copy consistency
 Disaster recovery and backup
 Flexible expansion
Distributed Query processing
Definition

Distributed query processing is the procedure of answering queries (which means mainly
read operations on large data sets) in a distributed environment where data is managed at
multiple sites in a computer network.

Costs (transfer of data) of distributed query processing:

Suppose a query needs data stored at two sites, S1 and S2, and a third site S3 is also available.
1. We can transfer the data from S2 to S1 and then process the query.
2. We can transfer the data from S1 to S2 and then process the query.
3. We can transfer the data from S1 and S2 to S3 and then process the query.
The choice depends on various factors, such as the size of the relations and the results,
the communication cost between different sites, and the site at which the result will
be used.
Commonly, the data transfer cost is calculated in terms of the size of the messages, using
the following formula:
Data transfer cost = C * Size
where C is the cost per byte transferred and Size is the number of bytes transmitted.
2. Using semi-join in distributed query processing:
The semi-join operation is used in distributed query processing to reduce the number of
tuples in a table before transmitting it to another site.
Example: Find the amount of data transferred to execute the same query given in the
above example using the semi-join operation.
Answer: The following strategy can be used to execute the query.
1. Select (or project) the required attributes of the EMPLOYEE table at site 1 and then
transfer them to site 3. For this, we will transfer NAME and DID (EMPLOYEE), and
the size is 25 * 1000 = 25000 bytes.
2. Transfer the table DEPARTMENT to site 3 and join the projected attributes of
EMPLOYEE with this table. The size of the DEPARTMENT table is 25 * 50 =
1250 bytes.
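
The arithmetic in the two steps can be checked with a short Python sketch. It assumes that each product is bytes per tuple times number of tuples, and that the cost per byte C is 1; neither assumption is stated in the notes.

```python
def transfer_cost(size_bytes, cost_per_byte=1.0):
    # Data transfer cost = C * Size
    return cost_per_byte * size_bytes


# Step 1: projected EMPLOYEE attributes (NAME, DID) sent to site 3.
employee_projection = 25 * 1000   # 25000 bytes

# Step 2: the whole DEPARTMENT table sent to site 3.
department_table = 25 * 50        # 1250 bytes

total_bytes = employee_projection + department_table  # 26250 bytes
total_cost = transfer_cost(total_bytes)                # 26250.0 when C = 1
```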
Unit 3

Extensible Markup Language (XML) is a set of rules for encoding documents in machine-readable
form. It is defined in the XML 1.0 Specification produced by the W3C, and several
other related specifications, all of them free open standards.
XML's design goals emphasize simplicity, generality, and usability over the Internet. It is a
textual data format with strong support via Unicode for the languages of the world. Although
the design of XML focuses on documents, it is widely used for the representation of arbitrary
data structures, for example in web services.
Many application programming interfaces (APIs) have been developed that software
developers use to process XML data, and several schema systems exist to aid in the definition
of XML-based languages.

TYPES

What are the types of XML databases?


There are two types of XML databases.
 XML-enabled database.
 Native XML database (NXD)
Application of XML
XML can be used to exchange the information between organizations and systems. XML
can be used for offloading and reloading of databases. XML can be used to store and arrange
the data, which can customize your data handling needs. XML can easily be merged with
style sheets to create almost any desired output.

What is XML?

The Extensible Markup Language (XML) is a simple text-based format
for representing structured information: documents, data, configuration, books,
transactions, invoices, and much more. It was derived from an older standard format called
SGML (ISO 8879), in order to be more suitable for Web use.
Advantages of XML
 XML uses human, not computer, language. XML is readable and understandable,
even by novices, and no more difficult to code than HTML.
 XML is completely compatible with Java™ and 100% portable. Any application that
can process XML can use your information, regardless of platform.
 XML is extendable.

A basic summary of the main features of XML follows:


 Excellent for handling data with a complex structure or atypical data.
 Data described using markup language.
 Text data description.
 Human- and computer-friendly format.
 Handles data in a tree structure having one-and only one-root element.

Disadvantages of XML

 XML syntax is redundant or large relative to binary representations of similar data, especially with tabular data.
 XML syntax is verbose, especially for human readers, relative to other text-based data transmission formats.

1. Structured data –
Structured data is data whose elements are addressable for effective analysis. It
has been organized into a formatted repository that is typically a database. It
covers all data that can be stored in an SQL database in a table with rows and
columns. Such data has relational keys and can easily be mapped into pre-designed
fields. Today, it is the most processed kind of data and the simplest to
manage. Example: Relational data.

2. Semi-Structured data –
Semi-structured data is information that does not reside in a relational database
but that has some organizational properties that make it easier to analyze. With
some processing, it can be stored in a relational database (which can be very hard
for some kinds of semi-structured data), but semi-structured formats exist to save
space. Example: XML data.

3. Unstructured data –
Unstructured data is data that is not organized in a predefined manner or does
not have a predefined data model, so it is not a good fit for a mainstream
relational database. There are alternative platforms for storing and managing
unstructured data. It is increasingly prevalent in IT systems and is used by
organizations in a variety of business intelligence and analytics
applications. Example: Word, PDF, Text, Media logs.

Differences between Structured, Semi-structured and Unstructured data:

Technology
 Structured data: based on relational database tables
 Semi-structured data: based on XML/RDF (Resource Description Framework)
 Unstructured data: based on character and binary data

Transaction management
 Structured data: matured transactions and various concurrency techniques
 Semi-structured data: transactions are adapted from DBMS and not matured
 Unstructured data: no transaction management and no concurrency

Version management
 Structured data: versioning over tuples, rows and tables
 Semi-structured data: versioning over tuples or graphs is possible
 Unstructured data: versioned as a whole

Flexibility
 Structured data: schema dependent and less flexible
 Semi-structured data: more flexible than structured data but less flexible than unstructured data
 Unstructured data: more flexible, and there is no schema

Scalability
 Structured data: very difficult to scale the DB schema
 Semi-structured data: scaling is simpler than for structured data
 Unstructured data: more scalable

Robustness
 Structured data: very robust
 Semi-structured data: new technology, not very widespread
 Unstructured data: (no entry)

Query performance
 Structured data: structured queries allow complex joins
 Semi-structured data: queries over anonymous nodes are possible
 Unstructured data: only textual queries are possible

What is XQuery?

XQuery is to XML what SQL is to databases.

XQuery is designed to query XML data.

XQuery Example
for $x in doc("books.xml")/bookstore/book
where $x/price>30
order by $x/title
return $x/title
What is XQuery?

 XQuery is the language for querying XML data


 XQuery for XML is like SQL for databases
 XQuery is built on XPath expressions
 XQuery is supported by all major databases
 XQuery is a W3C Recommendation

XQuery is About Querying XML

XQuery is a language for finding and extracting elements and attributes from XML
documents.

Here is an example of what XQuery could solve:

"Select all CD records with a price less than $10 from the CD collection stored in
cd_catalog.xml"

XQuery and XPath

XQuery 1.0 and XPath 2.0 share the same data model and support the same functions and
operators. If you have already studied XPath you will have no problems with understanding
XQuery.

XQuery - Examples of Use

XQuery can be used to:

 Extract information to use in a Web Service


 Generate summary reports
 Transform XML data to XHTML
 Search Web documents for relevant information

XML Query Using XPath and XQuery

The basic steps for querying XML content (in Teradata Database) are:

1. Compose an XPath or XQuery query and use BTEQ or SQL Assistant to test it
against XML documents in the database.
2. Depending on the intended use, you can use these methods and functions to evaluate
the XML queries:
o XMLEXTRACT method
o EXISTSNODE method
o XMLQUERY function
o XMLTABLE function

The XML Example Document


We will use the following XML document in the examples below.

"books.xml":

<?xml version="1.0" encoding="UTF-8"?>

<bookstore>

<book category="COOKING">
<title lang="en">Everyday Italian</title>
<author>Giada De Laurentiis</author>
<year>2005</year>
<price>30.00</price>
</book>

<book category="CHILDREN">
<title lang="en">Harry Potter</title>
<author>J K. Rowling</author>
<year>2005</year>
<price>29.99</price>
</book>

<book category="WEB">
<title lang="en">XQuery Kick Start</title>
<author>James McGovern</author>
<author>Per Bothner</author>
<author>Kurt Cagle</author>
<author>James Linn</author>
<author>Vaidyanathan Nagarajan</author>
<year>2003</year>
<price>49.99</price>
</book>

<book category="WEB">
<title lang="en">Learning XML</title>
<author>Erik T. Ray</author>
<year>2003</year>
<price>39.95</price>
</book>

</bookstore>
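
For readers without an XQuery processor, the earlier XQuery example (titles of books with a price above 30, ordered by title) can be approximated in Python with the standard `xml.etree.ElementTree` module. This is a sketch of equivalent logic, not XQuery itself. The embedded document is an abbreviated copy of books.xml that keeps only the title and price elements.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of books.xml (authors and years omitted).
BOOKS_XML = """
<bookstore>
  <book category="COOKING"><title lang="en">Everyday Italian</title><price>30.00</price></book>
  <book category="CHILDREN"><title lang="en">Harry Potter</title><price>29.99</price></book>
  <book category="WEB"><title lang="en">XQuery Kick Start</title><price>49.99</price></book>
  <book category="WEB"><title lang="en">Learning XML</title><price>39.95</price></book>
</bookstore>
"""

root = ET.fromstring(BOOKS_XML)

# for $x in /bookstore/book where $x/price > 30
expensive = [b for b in root.findall("book") if float(b.findtext("price")) > 30]

# order by $x/title return $x/title
titles = sorted(b.findtext("title") for b in expensive)
# titles == ['Learning XML', 'XQuery Kick Start']
```

Everyday Italian is excluded because its price is exactly 30.00, which is not greater than 30.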

Unit 4
Introduction
NOSQL
 Not only SQL
 Most NOSQL systems are distributed databases or distributed storage systems
 Focus on semi-structured data storage, high performance, availability, data replication, and scalability
NOSQL systems focus on storage of “big data”
Typical applications that use NOSQL
 Social media
 Web links
 User profiles
 Marketing and sales
 Posts and tweets
 Road maps and spatial data
 Email
Examples of NOSQL systems
 BigTable − Google’s proprietary NOSQL system; column-based or wide column store
 DynamoDB (Amazon) − Key-value data store
 Cassandra (Facebook) − Uses concepts from both key-value stores and column-based systems
 MongoDB and CouchDB − Document stores
 Neo4J and GraphBase − Graph-based NOSQL systems
 OrientDB − Combines several concepts
Other database systems are classified on the object model or the native XML model.
The CAP theorem is a result from theoretical computer science about distributed data
stores. It states that, in the event of a network failure on a distributed database, it is
possible to provide either consistency or availability, but not both.
What is the CAP theorem?
The CAP theorem is named after three properties of distributed data stores:

 Consistency. All reads receive the most recent write or an error.
 Availability. All reads return data, but it might not be the most recent.
 Partition tolerance. The system continues to operate despite network failures (i.e., dropped partitions, slow network connections, or unavailable network connections between nodes).

In normal operations, your data store provides all three functions. But the CAP theorem
maintains that when a distributed database experiences a network failure, you can provide
either consistency or availability.

 High consistency comes at the cost of lower availability.


 High availability comes at the cost of lower consistency.
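
The trade-off can be illustrated with a deliberately tiny Python model. It is a toy sketch, not how any real database works: the class `TinyStore`, its two replicas and its `mode` flag are all assumptions made for illustration. Writes always reach replica A; replica B receives them only while the network is healthy.

```python
class TinyStore:
    def __init__(self, mode):
        self.mode = mode          # "CP" prefers consistency, "AP" prefers availability
        self.replica_a = None
        self.replica_b = None
        self.partitioned = False  # True while A and B cannot communicate

    def write(self, value):
        self.replica_a = value
        if not self.partitioned:
            self.replica_b = value  # replication succeeds only without a partition

    def read_from_b(self):
        if self.partitioned and self.mode == "CP":
            # Consistent choice: refuse to answer rather than risk stale data.
            raise RuntimeError("unavailable: replica B may be stale")
        # Available choice: always answer, even if the value is out of date.
        return self.replica_b


ap = TinyStore("AP")
ap.write("v1")
ap.partitioned = True
ap.write("v2")
stale = ap.read_from_b()  # 'v1': an answer, but not the most recent write

cp = TinyStore("CP")
cp.write("v1")
cp.partitioned = True
cp.write("v2")
# cp.read_from_b() now raises RuntimeError instead of returning 'v1'
```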

NoSQL
NoSQL databases do not require a fixed schema, and don’t enforce relations between tables. In
document-oriented NoSQL databases, records are typically JSON documents, which are complete
entities one can readily read and understand. NoSQL databases are widely recognized for:

 Ease-of-use
 Scalable performance
 Strong resilience
 Wide availability

Examples of NoSQL databases include:

 Cloud Firestore
 Firebase Real-time DB
 MongoDB
 MarkLogic
 Couchbase
 CloudDB
 Amazon DynamoDB

Consistency in databases
Consistent databases should be used when the value of the information returned needs to be
accurate.

Financial data is a good example. When a user logs in to their banking institution, they do not
want to see an error that no data is returned, or that the value is higher or lower than it
actually is. Banking apps should return the exact value of a user’s account information. In
this case, banks would rely on consistent databases.
Examples of data that call for a consistent database include:

 Bank account balances


 Text messages

Database options for consistency:

 MongoDB
 Redis
 HBase

Availability in databases
Highly available databases should be used when keeping the service running is more important
than returning the most recent information.

An example of having a highly available database can be seen in e-commerce businesses.


Online stores want to make their store and the functions of the shopping cart available 24/7 so
shoppers can make purchases exactly when they need.

Database options for availability:

 Cassandra
 DynamoDB
 Cosmos DB

Some database options, like Cosmos DB and Cassandra, let a user tune which guarantee
they prefer: consistency or availability.
Unit 5

What is Database Security?


Database security includes a variety of measures used to secure
database management systems from malicious cyber-attacks and
illegitimate use. Database security programs are designed to
protect not only the data within the database, but also the data
management system itself, and every application that accesses it,
from misuse, damage, and intrusion.

Database security encompasses tools, processes, and
methodologies which establish security inside a database
environment.

Database Security Threats


Many software vulnerabilities, misconfigurations, or patterns of
misuse or carelessness could result in breaches. Here are some of
the best-known causes and types of database security threats.

Insider Threats

An insider threat is a security risk from one of the following three
sources, each of which has privileged means of entry to the
database:

 A malicious insider with ill intent
 A negligent person within the organization who exposes the
database to attack through careless actions
 An outsider who obtains credentials through social
engineering or other methods, or gains access to the
database’s credentials

Insider threats are among the most common causes of database
security breaches, and they often occur because too many employees
have been granted privileged user access.

Human Error

Weak passwords, password sharing, accidental erasure or
corruption of data, and other undesirable user behaviors are still
the cause of almost half of data breaches reported.

Exploitation of Database Software Vulnerabilities

Attackers constantly attempt to identify and target vulnerabilities
in software, and database management software is a highly
valuable target. New vulnerabilities are discovered daily, and all
open source database management platforms and commercial
database software vendors issue security patches regularly.
However, if you don’t apply these patches quickly, your database
might be exposed to attack.

Even if you do apply patches on time, there is always the risk
of zero-day attacks, in which attackers exploit a vulnerability
that the database vendor has not yet discovered and patched.

SQL/NoSQL Injection Attacks

A database-specific threat involves inserting arbitrary SQL or
non-SQL attack strings into database queries. Typically, these are
queries built from web application form input or from data
received via HTTP requests. Any database system is vulnerable to
these attacks if developers do not adhere to secure coding
practices and the organization does not carry out regular
vulnerability testing.
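
One secure coding practice that addresses this threat is to pass user input as a bound parameter instead of concatenating it into the query text. The sketch below uses Python's built-in sqlite3 module with an in-memory database; the table, the column names, and the two helper functions are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_user_unsafe(name: str) -> list:
    # Vulnerable: user input is concatenated directly into the SQL text.
    query = "SELECT name, secret FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_user_safe(name: str) -> list:
    # Safer: the value is sent as a bound parameter and is never parsed as SQL.
    return conn.execute(
        "SELECT name, secret FROM users WHERE name = ?", (name,)
    ).fetchall()


attack = "nobody' OR '1'='1"
print(len(find_user_unsafe(attack)))  # 2: the injected condition matches every row
print(len(find_user_safe(attack)))    # 0: the whole string is compared as a name
```
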

Buffer Overflow Attacks

A buffer overflow takes place when a process tries to write more
data to a fixed-length block of memory than it is permitted to
hold. Attackers might use the excess data, which spills into
adjacent memory addresses, as the starting point from which to
launch attacks.

Denial of Service (DoS/DDoS) Attacks

In a denial of service (DoS) attack, the cybercriminal overwhelms
the target service (in this instance, the database server) with a
large number of fake requests. The result is that the server
cannot carry out genuine requests from actual users, and often
crashes or becomes unstable.

In a distributed denial of service (DDoS) attack, fake traffic is
generated by a large number of computers participating in
a botnet controlled by the attacker. This generates very large
traffic volumes, which are difficult to stop without a highly
scalable defensive architecture. Cloud-based DDoS
protection services can scale up dynamically to address very
large DDoS attacks.

Malware

Malware is software written to take advantage of vulnerabilities or
to cause harm to a database. Malware could arrive through any
endpoint device connected to the database’s network. Malware
protection is important on any endpoint, but especially so on
database servers, because of their high value and sensitivity.

An Evolving IT Environment

The evolving IT environment is making databases more
susceptible to threats. Here are trends that can lead to new types
of attacks on databases, or may require new defensive measures:

 Growing data volumes: storage, data capture, and
processing are growing exponentially across almost all
organizations. Any data security practices or tools must be
highly scalable to address near-term and long-term
requirements.
 Distributed infrastructure: network environments are
increasing in complexity, especially as businesses transfer
workloads to hybrid cloud or multi-cloud architectures,
making the deployment, management, and choice of
security solutions more difficult.
 Increasingly tight regulatory requirements: the
worldwide regulatory compliance landscape is growing in
complexity, so following all mandates is becoming more
challenging.
 Cybersecurity skills shortage: there is a global shortage
of skilled cybersecurity professionals, and organizations are
finding it difficult to fill security roles. This can make it more
difficult to defend critical infrastructure, including databases.

How Can You Secure Your Database Server?


A database server is a physical or virtual machine running the
database. Securing a database server, also known as
“hardening”, is a process that includes physical security, network
security, and secure operating system configuration.

Ensure Physical Database Security

Refrain from sharing a server between web applications and database
applications if your database contains sensitive data. Although it
could be cheaper and easier to host your site and database
together with a hosting provider, you are placing the security of
your data in someone else’s hands.

If you do rely on a web hosting service to manage your database,
you should ensure that it is a company with a strong security
track record. It is best to steer clear of free hosting services due to
the possible lack of security.

If you manage your database in an on-premise data center, keep
in mind that your data center is also prone to attacks from
outsiders or insider threats. Ensure you have physical security
measures, including locks, cameras, and security personnel in
your physical facility. Any access to physical servers must be
logged and only granted to authorized individuals.

In addition, do not leave database backups in locations that are
publicly accessible, such as temporary partitions, web folders, or
unsecured cloud storage buckets.

Lock Down Accounts and Privileges

Let’s consider the Oracle database server. After the database is
installed, the Oracle database configuration assistant (DBCA)
automatically expires and locks most of the default database user
accounts.

If you install an Oracle database manually, this doesn’t happen,
and default privileged accounts won’t be expired or locked. Their
password stays the same as their username, by default.
An attacker will try to use these credentials first to connect to the
database.

It is critical to ensure that every privileged account on a database
server is configured with a strong, unique password. If accounts
are not needed, they should be expired and locked.

For the remaining accounts, access has to be limited to the
absolute minimum required. Each account should only have
access to the tables and operations (for example, SELECT or
INSERT) required by the user. Avoid creating user accounts with
access to every table in the database.
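
The least-privilege rule above can be pictured as a deny-by-default lookup. This is only a conceptual sketch: the account names, table names, and the is_allowed helper are hypothetical, and a real database enforces the same idea through its own privilege system rather than application code.

```python
# Hypothetical grants: account -> table -> operations explicitly allowed.
GRANTS = {
    "report_reader": {"orders": {"SELECT"}},
    "order_service": {"orders": {"SELECT", "INSERT"}},
}


def is_allowed(account: str, operation: str, table: str) -> bool:
    """Deny by default; allow only operations explicitly granted on a table."""
    return operation.upper() in GRANTS.get(account, {}).get(table, set())


print(is_allowed("report_reader", "SELECT", "orders"))     # True
print(is_allowed("report_reader", "INSERT", "orders"))     # False
print(is_allowed("order_service", "SELECT", "customers"))  # False
```

Note that no account in the sketch has a wildcard entry covering every table, which mirrors the advice to avoid such accounts.
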

Regularly Patch Database Servers

Ensure that patches remain current. Effective database patch
management is a crucial security practice because attackers are
actively seeking out new security flaws in databases, and
new viruses and malware appear on a daily basis.

Timely deployment of up-to-date database service
packs, critical security hotfixes, and cumulative updates will
improve the stability and performance of the database.

Disable Public Network Access

Organizations store their application data in databases. In most
real-world scenarios, the end user doesn’t require direct access to the
database. Thus, you should block all public network access to
database servers unless you are a hosting provider. Ideally, an
organization should set up gateway servers (VPN or SSH tunnels)
for remote administrators.

Encrypt All Files and Backups

Irrespective of how solid your defenses are, there is always a
possibility that a hacker may infiltrate your system. Yet, attackers
are not the only threat to the security of your database. Your
employees may also pose a risk to your business. There is always
the possibility that a malicious or careless insider will gain access
to a file they don’t have permission to access.

Encrypting your data makes it unreadable to both attackers and
unauthorized employees. Without the encryption key, they cannot
read it, which provides a last line of defense against unwelcome
intrusions. Encrypt all important application files, data files, and
backups so that unauthorized users cannot read your critical data.

Database Security Best Practices


Here are several best practices you can use to improve the
security of sensitive databases.

Actively Manage Passwords and User Access

If you have a large organization, you should consider automating
access management via password management or access
management software. This will provide permitted users with a
short-term password with the rights they need every time they
need to gain access to a database.

It also keeps track of the activities completed during that time
frame and stops administrators from sharing passwords. While
administrators may feel that sharing passwords is convenient,
doing so makes effective database accountability and
security almost impossible.

In addition, the following security measures are recommended:

 Strong passwords must be enforced
 Password hashes must be salted and stored encrypted
 Accounts must be locked following multiple failed login attempts
 Accounts must be regularly reviewed and deactivated if staff
move to different roles, leave the company, or no longer
require the same level of access
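
Two of the measures above, salted password hashes and lockout after repeated failed logins, can be sketched with Python's standard library. This is an illustration under stated assumptions, not a production design: the Account class, the iteration count, and the lockout threshold are all assumed values, and encrypting the stored hashes at rest is not shown.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

PBKDF2_ITERATIONS = 100_000  # assumed work factor; raise it as hardware allows
MAX_FAILED_ATTEMPTS = 5      # assumed lockout threshold


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Return (salt, digest); a fresh random salt is generated when none is given."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
    )
    return salt, digest


class Account:
    """Hypothetical account record that stores only the salt and digest."""

    def __init__(self, password: str) -> None:
        self.salt, self.digest = hash_password(password)
        self.failed_attempts = 0

    @property
    def locked(self) -> bool:
        return self.failed_attempts >= MAX_FAILED_ATTEMPTS

    def login(self, password: str) -> bool:
        if self.locked:
            return False  # locked accounts reject even the correct password
        _, candidate = hash_password(password, self.salt)
        if hmac.compare_digest(candidate, self.digest):
            self.failed_attempts = 0
            return True
        self.failed_attempts += 1
        return False
```

Because each account gets its own random salt, two users with the same password end up with different stored digests, and the constant-time comparison avoids leaking how many bytes matched.
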

Test Your Database Security

Once you have put in place your database security infrastructure,
you must test it against a real threat. Auditing or performing
penetration tests against your own database will help you get into
the mindset of a cybercriminal and isolate any vulnerabilities you
may have overlooked.

To make sure the test is comprehensive, involve ethical hackers
or recognized penetration testing services in your security testing.
Penetration testers provide extensive reports listing database
vulnerabilities, and it is important to quickly investigate and
remediate these vulnerabilities. Run a penetration test on a
critical database system at least once per year.

Use Real-Time Database Monitoring

Continually scanning your database for breach attempts increases
your security and lets you rapidly react to possible attacks.

In particular, File Integrity Monitoring (FIM) can help you log all
actions carried out on the database’s server and alert you to
potential breaches. When FIM detects a change to important
database files, ensure security teams are alerted and able to
investigate and respond to the threat.
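
The change-detection part of FIM can be approximated by recording a cryptographic hash of each important file and comparing later readings against that baseline. The sketch below is a simplified illustration that covers content changes and missing files only; the function names are hypothetical, and a real FIM product would also handle logging and alerting.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    sha256 = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            sha256.update(chunk)
    return sha256.hexdigest()


def take_baseline(paths: Iterable[Path]) -> Dict[str, str]:
    """Record the current digest of every monitored file."""
    return {str(path): file_digest(Path(path)) for path in paths}


def changed_files(baseline: Dict[str, str]) -> List[str]:
    """List monitored files that are missing or whose content has changed."""
    changed = []
    for name, expected in baseline.items():
        path = Path(name)
        if not path.is_file() or file_digest(path) != expected:
            changed.append(name)
    return changed
```

A scheduled job could call changed_files against a stored baseline and notify the security team whenever the returned list is not empty.
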

Use Web Application and Database Firewalls

You should use a firewall to protect your database server from
database security threats. By default, the firewall should deny all
traffic. It should also stop your database from starting
outbound connections unless there is a particular reason for doing
so.

As well as safeguarding the database with a firewall, you must
deploy a web application firewall (WAF). This is because attacks
aimed at web applications, including SQL injection, can be used to
gain illicit access to your databases.

A database firewall will not stop most web application attacks,
because traditional firewalls operate at the network layer, while
web application attacks operate at the application layer (layer 7 of
the OSI model). A WAF operates at layer 7 and is able to detect
malicious web application traffic, such as SQL injection attacks,
and block it before it can harm your database.

Imperva Database Security


Imperva provides an industry-leading Web Application Firewall,
which can prevent web application attacks that affect databases,
including SQL injection. Imperva also provides file integrity protection
(FIM) and file security technology, defending sensitive files from
cybercriminals and malicious insiders.

In addition, Imperva protects all cloud-based data stores to
ensure compliance and preserve the agility and cost benefits you
get from your cloud investments:

Cloud Data Security – Simplify securing your cloud databases to
catch up and keep up with DevOps. Imperva’s solution enables
cloud-managed services users to rapidly gain visibility and control
of cloud data.

Database Security – Imperva delivers analytics, protection, and
response across your data assets, on-premise and in the cloud,
giving you the risk visibility to prevent data breaches and avoid
compliance incidents. Integrate with any database to gain instant
visibility, implement universal policies, and speed time to value.

Data Risk Analysis – Automate the detection of non-compliant,
risky, or malicious data access behavior across all of your
databases enterprise-wide to accelerate remediation.
