Lecture 5
Week 5
Introduction to Database
Security Requirements
Reliability and Integrity
Sensitive Data
Inference
Definition:
“Security protects data from intentional or
accidental misuse or destruction, by controlling
access to the data.” – Stamper & Price
“Database security is concerned with the ability
of the system to enforce a security policy
governing the disclosure, modification or
destruction of information.” - Pangalos
It is the protection of data held in
databases against unauthorized access
Why is securing data important?
Information is a critical resource in an
enterprise
Securing data has become a billion-dollar
industry
People want to secure their confidential
information not only from hackers but also
from legitimate, professional direct-marketing
corporations
Moral/Ethical
There may be moral reasons for controlling who has access
to information. For example, medical records are
confidential because of people’s right to privacy.
Legal requirements
The Data Protection Act requires companies to register personal
data with the data protection registrar.
Commercial security
Information held by companies is a valuable resource which may be
useful to competitors
Fraud/Sabotage
Information may be misused, for example, insider dealing, or used
to mislead.
Mistakes
Many problems are not malicious but are caused by users
accidentally changing the data
Authorization Policies
Disclosure and modification of data
Data Consistency Policies
Consistency and correctness of data
Availability Policies
Availability of information to users
Identification/Authentication/Audit Policies
Authorizing users to access data
How valuable is the data?
Which data must be secured?
What will illegal access to the data cost?
What are the implications of
changed/destroyed data?
Will security measures affect the proper
functioning of the database?
How can unauthorized access occur?
Database
Collection of data.
Rules describing how data are related.
DBA (database administrator)
Defines rules.
Determines access.
DBMS
A program used to interact with the database.
Characteristics of a Good DBMS
Data independence
Shared access
Minimal redundancy
Data consistency
Data integrity
Privacy
Signifies that data is not disclosed to
unauthorized users
Integrity
Ensures that an unauthorized user cannot
modify data
Availability
Ensures that data is reliably available to
authorized users
Four levels of enforcing database security:
Physical security
Such as storage medium safekeeping and fire
protection
Operating system security
Such as the use of an access control matrix, capability
list and accessor list
DBMS security
Such as protection mechanisms and query modification
Data encryption
Such as the RSA scheme and the Data Encryption Standard (DES)
The first three levels (physical security,
operating system security and DBMS security)
cannot provide a fully satisfactory solution
to the database security problem, for the
following reasons:
Hard to control the disclosure of raw data
Ineffective at controlling the disclosure of
sensitive data
Hard to control the disclosure of
confidential data in a distributed database
system
Hard to verify that the origin of a data
item is authentic
Therefore, one way to solve the problem is
to use ENCRYPTION methods to
enforce database security.
How?
Data is encrypted into ciphertext, which
can be decrypted only with the proper
decryption keys
This limits the problem of data
disclosure, provided the keys stay secret
It also helps with data integrity, since an
intruder cannot make a meaningful change to
ciphertext without the encryption key
Database Structure
A database file consists of records.
A database record consists of fields.
The name of a column is an attribute.
A relation is a set of columns.
Database Format
Logical format is defined by rules.
Logical structure is a schema.
Physical format is defined by
storage schema.
Select
Extracts certain rows from the database
Project
Extracts values from the specified fields
Join
Merges two subschemas on a common element
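The three operations can be sketched on small in-memory relations. This is only an illustration: the relations `students` and `dorms`, their fields, and the helper names are assumptions made up for the example, not part of any particular DBMS.

```python
# Hypothetical toy relations; names and values are illustrative only.
students = [
    {"name": "Adams", "dorm": "Holmes", "major": "CS"},
    {"name": "Bailey", "dorm": "Grey", "major": "Math"},
    {"name": "Chin", "dorm": "Holmes", "major": "Math"},
]
dorms = [
    {"dorm": "Holmes", "warden": "Lee"},
    {"dorm": "Grey", "warden": "Ortiz"},
]


def select(relation, predicate):
    """Select: return the rows that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def project(relation, fields):
    """Project: return only the named fields of every row."""
    return [{f: row[f] for f in fields} for row in relation]


def join(left, right, key):
    """Join: merge rows of two relations that agree on a common field."""
    return [{**a, **b} for a in left for b in right if a[key] == b[key]]


holmes = select(students, lambda row: row["dorm"] == "Holmes")
names = project(holmes, ["name"])
with_warden = join(students, dorms, "dorm")
```

Here `names` holds only the name field of the Holmes residents, and `with_warden` pairs every student with the warden of the matching dorm.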
Shared access
Minimal redundancy
Data consistency
Data integrity
Controlled access
Physical Database Integrity
Logical Database Integrity
Element Integrity
Access Control
User Authentication
Availability
For a database:
Users must be able to trust the
accuracy of the data values
Updates must be performed only
by authorized individuals
Data must be protected from
corruption, either by an illegal
program action or by an outside
force
Situations Affecting Integrity
Damage to entire database.
Damage to individual database
item.
One form of Protection:
Back up
The DBMS maintains element integrity in three
ways:
Field checks
Test that values are appropriate for a position, such
as null value control and default values
Access control
Controls users' access rights, such as the
authorization to update a certain element
Change log
Maintains a change log for the database (a
change log is a list of every change made to the
database)
An audit trail is desirable in order to:
Determine who did what.
Prevent incremental access.
An audit trail of all accesses is impractical:
Slow
Large
Possible over-reporting.
Pass-through problem: a field may be
accessed during a select operation but its
values never reported to the user.
This helps to:
Maintain the integrity of a database or
discover who had affected what values
and when
Detect users who build up access to
protected data incrementally.
An audit trail must include accesses at the
record, field and element levels
Recall
DBMS - enforces DBA's policy.
Operating System vs. Databases
Access control for Operating Systems
Deals with unrelated data.
Deals with entire files.
Access control for Databases
Deals with records and fields.
Concerned with inference of one field
from another.
An access control list for several hundred
files is easier to implement than an access
control list for a database
User Authentication
DBMS runs on top of operating system.
No trusted path.
Must be suspicious of information received.
(principle of mutual suspicion)
Availability
Arbitration of two users' requests for the same
record.
Withholding some non-protected data to avoid
revealing protected data.
Database Integrity
Concern that the database as a whole is
protected from damage.
Element Integrity
Concern that the value of a specific
element is written or changed only by
actions of authorized users.
Element Accuracy
Concern that only correct values are
written into the elements of a database.
Problem
Failure of system while modifying data.
Results
Single field - half of a field being updated may show the old
data.
Multiple fields - no single field reflects an obvious error.
Solution
Update in two phases.
First phase - Intent Phase
DBMS gathers the information and other resources needed to
perform the update.
Makes no changes to database.
Second Phase - Commit Phase
Write commit flag to database.
DBMS makes permanent changes.
If the system fails during the second phase, the database may
contain incomplete data, but this can be repaired by repeating
all activities of the second phase.
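A minimal sketch of the two-phase update, assuming a toy in-memory store. The class `TwoPhaseStore`, its method names, and the `fail_after` parameter (used only to simulate a crash) are hypothetical and not taken from any real DBMS.

```python
class TwoPhaseStore:
    """Toy in-memory store illustrating a two-phase update (hypothetical API)."""

    def __init__(self, data):
        self.data = dict(data)
        self.commit_flag = False   # set at the start of the commit phase
        self.pending = {}          # shadow values prepared in the intent phase

    def intent(self, changes):
        # Phase 1: gather everything needed; the database itself is untouched.
        for key in changes:
            if key not in self.data:
                raise KeyError(key)
        self.pending = dict(changes)

    def commit(self, fail_after=None):
        # Phase 2: write the commit flag, then make the changes permanent.
        self.commit_flag = True
        for count, (key, value) in enumerate(self.pending.items()):
            if fail_after is not None and count == fail_after:
                raise RuntimeError("simulated crash during commit phase")
            self.data[key] = value
        self.pending = {}
        self.commit_flag = False

    def recover(self):
        # After a crash: if the flag is set, repeat the whole commit phase.
        if self.commit_flag:
            self.commit()
        else:
            self.pending = {}      # crash before commit: discard the intent


store = TwoPhaseStore({"stock": 10, "on_order": 0})
store.intent({"stock": 7, "on_order": 3})
try:
    store.commit(fail_after=1)     # crash after the first field is written
except RuntimeError:
    pass
half_done = dict(store.data)       # inconsistent: only one field updated
store.recover()                    # repeats the commit phase in full
```

Repeating the commit phase is safe here because writing the same shadow values twice leaves the same result.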
Error Detection and Correction Code
Shadow Fields
Recovery
Concurrency/consistency
Monitors
Error Detection and Correction Code
Parity checks.
Cyclic redundancy checks (CRC).
Hamming codes.
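As a small illustration of the first two codes, the sketch below stores a parity bit and a CRC alongside a record and then checks a damaged copy. The record contents and the helper `even_parity_bit` are made up for the example; `zlib.crc32` is the standard-library CRC-32 routine.

```python
import zlib


def even_parity_bit(data: bytes) -> int:
    """Return the bit that makes the total number of 1 bits even."""
    ones = sum(bin(byte).count("1") for byte in data)
    return ones % 2


record = b"Adams,Holmes,CS"
stored_parity = even_parity_bit(record)
stored_crc = zlib.crc32(record)

# Simulate corruption of a single bit in the first byte.
damaged = bytes([record[0] ^ 0x01]) + record[1:]

parity_detects = even_parity_bit(damaged) != stored_parity
crc_detects = zlib.crc32(damaged) != stored_crc
```

Both checks only detect the error. Correcting it needs a code with more redundancy, such as a Hamming code.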
Shadow Fields
Copy of entire attributes or records.
Second copy can provide replacement.
Recovery
Backup
Change log (Audit Log)
Concurrency/consistency
Simultaneous read is not a problem.
Modification requires one to be locked
out.
Query-update cycle treated as a single
uninterrupted operation.
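One way to treat the query-update cycle as a single operation is to hold a lock across both steps. The sketch below assumes a hypothetical seat-reservation record and uses Python's `threading.Lock`; a real DBMS would use its own locking or transaction mechanism.

```python
import threading

inventory = {"seats": 50}            # shared record (hypothetical)
record_lock = threading.Lock()       # one writer at a time
results = []


def reserve():
    """Query and update as one uninterrupted operation."""
    with record_lock:                # other users are locked out here
        if inventory["seats"] > 0:   # query
            inventory["seats"] -= 1  # update
            results.append(True)
        else:
            results.append(False)


threads = [threading.Thread(target=reserve) for _ in range(80)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Because the check and the decrement happen under the same lock, no seat can be sold twice, whatever order the threads run in.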
Monitors
Range Comparison
Tests each new value to ensure value is
within acceptable range.
Can be used to ensure internal
consistency of database.
State Constraints
Describes the condition of the entire
database.
Transition Constraints
Describes conditions necessary before
changes can be applied to database.
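The three kinds of monitor check can be sketched as simple predicates. The employee records and the specific rules (exactly one president, no reuse of an employee id) are assumptions chosen for illustration only.

```python
def range_check(value, low, high):
    """Range comparison: accept a new value only if low <= value <= high."""
    return low <= value <= high


def state_constraint(employees):
    """State constraint (illustrative): exactly one employee is president."""
    return sum(1 for e in employees if e["title"] == "president") == 1


def transition_constraint(employees, new_hire):
    """Transition constraint (illustrative): a new id must not be in use."""
    return all(e["id"] != new_hire["id"] for e in employees)


staff = [
    {"id": 1, "title": "president", "salary": 90000},
    {"id": 2, "title": "clerk", "salary": 30000},
]

in_range = range_check(30000, 20000, 100000)
out_of_range = range_check(5, 20000, 100000)
consistent = state_constraint(staff)
can_add = transition_constraint(staff, {"id": 3, "title": "clerk"})
duplicate = transition_constraint(staff, {"id": 2, "title": "clerk"})
```

A monitor would run checks like these before an update is applied and reject the update when one of them fails.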
Definition
Data that should not be made public
Factors that make data sensitive
Inherently sensitive
From a sensitive source
Declared sensitive
Of a sensitive attribute or record
Sensitive in relation to previously
disclosed information
Availability of data
One or more elements may be inaccessible
When performing an update several fields or records
may have to be locked
Assurance of authenticity
Certain characteristics of user may be considered
DBA may permit user to access database during
certain hours
History of previous requests
Acceptability of access
One could extract non-sensitive statistic from
sensitive data
The administrator might want to consider a
particular query unacceptable
Types of disclosure:
Exact Data
Disclosure of the exact value of the sensitive data itself
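Bounds
Disclosure that a sensitive value y lies between a lower bound L and an upper bound H (L <= y <= H)
Context dependent: may be beneficial or may be harmful
Negative result
Disclosure, through a query disguised as an innocent request, that a sensitive element does not have a particular value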
Existence
Disclosure of the existence of the data itself may be
sensitive
Probable value
Disclosure permits one to determine the probability that a
certain element has a certain value
Secrecy
Disclose only data that is not sensitive
Conservative approach says to reject any
query which mentions a sensitive field
May reject reasonable and non-disclosing
queries
Precision
Protect all sensitive data while revealing
as much non-sensitive data as possible
Inference problem:
Deriving sensitive data from non-sensitive data
Example:
Sensitive fields in the database:
• Aid: amount of financial aid a student is receiving
• Fines: parking fines still owed (in dollars)
• Drugs: survey result, from 0 (never used) to 3 (frequently used)
One tries to determine values of sensitive fields by seeking them
directly with queries that yield few records
Example
Obvious
List Name Where (Sex = M ^ Drugs = 1)
Less obvious
List Name Where
(Sex = M ^ Drugs = 1) v (Sex = M ^ Sex = F) v (Dorm = Ayers)
Organizations do not reveal results where a small number of
people make up a large proportion of the category.
Sum
An attack by sum tries to infer a value from a
reported sum
Count
Median
Tracker attacks
A tracker pads two different queries with
additional records so that the extra records
cancel out, leaving only the desired statistic
Linear System Vulnerability
It may be possible to determine a series of
queries that returns results relating to
several different sets
The queries form a set of linear equations
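A small sketch of a tracker attack, under stated assumptions: the records are invented, and the query interface `sum_aid` with its `MIN_SET_SIZE` policy is a hypothetical stand-in for a DBMS that refuses statistics over very small sets. The two permitted queries form a simple linear system whose difference is the refused value.

```python
# Hypothetical records; every value here is made up for illustration.
records = [
    {"name": "Adams", "sex": "M", "dorm": "Holmes", "aid": 5000},
    {"name": "Bailey", "sex": "M", "dorm": "Grey", "aid": 0},
    {"name": "Chin", "sex": "F", "dorm": "West", "aid": 3000},
    {"name": "Dewitt", "sex": "M", "dorm": "Grey", "aid": 1000},
    {"name": "Earhart", "sex": "F", "dorm": "Holmes", "aid": 2000},
    {"name": "Fein", "sex": "F", "dorm": "West", "aid": 1000},
]
MIN_SET_SIZE = 2  # assumed policy: refuse statistics over fewer records


def sum_aid(predicate):
    """Return the aid total for matching records, or None if suppressed."""
    matched = [r for r in records if predicate(r)]
    if len(matched) < MIN_SET_SIZE:
        return None
    return sum(r["aid"] for r in matched)


# The direct query isolates one person (the only male in Holmes): refused.
direct = sum_aid(lambda r: r["sex"] == "M" and r["dorm"] == "Holmes")

# Tracker: two permitted queries whose difference is the refused answer.
all_males = sum_aid(lambda r: r["sex"] == "M")
males_not_holmes = sum_aid(lambda r: r["sex"] == "M" and r["dorm"] != "Holmes")
inferred = all_males - males_not_holmes
```

The minimum-set-size rule blocks the direct query but not the pair of padded queries, which is why suppression alone does not stop inference.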
Suppression
Sensitive data values are not provided
Query is rejected without response
All results are correct but responses may be withheld in
order to maintain security
Concealing
Answer provided is close to but not the exact value.
Responses are provided but accuracy is lower.
Limited response suppression
The n-item, k-percent rule eliminates certain low-frequency
elements from being displayed
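One common reading of the n-item, k-percent rule is: withhold a statistic when n or fewer items account for more than k percent of the total. The sketch below follows that reading; the function names and the default values n=1 and k=80 are assumptions for the example.

```python
def dominated(values, n, k):
    """True if the n largest values make up more than k percent of the total."""
    total = sum(values)
    if total == 0:
        return False
    top = sum(sorted(values, reverse=True)[:n])
    return 100 * top > k * total


def release_sum(values, n=1, k=80):
    """Return the sum, or None when the rule says to suppress it."""
    return None if dominated(values, n, k) else sum(values)


balanced = release_sum([1000, 1200, 900, 1100])   # no single value dominates
skewed = release_sum([9000, 200, 300])            # one value dominates
```

In the skewed case the reported sum would reveal the dominant value almost exactly, so the response is suppressed.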
Combining results
Combine rows or columns to protect
sensitive values
Example:
Random sample
Random data perturbation
Query analysis
Three basic paths to controlling the
inference problem:
Suppress obviously sensitive information
Track what the user knows
Disguise the data
Three characteristics of database security:
The security of a single element may differ from
the security of other elements of the same
record or from values of the same attribute
(implies security should be implemented for
individual elements).
Several grades of security may be needed and
may represent ranges of allowable knowledge,
which may overlap. Typically, the security grades
form a lattice.
The security of an aggregate may differ from the
security of the individual elements (may be
higher or lower).
Granularity
Fairly easy to classify and track
a single sheet of paper - a paper file
a computer file
a single program
Every combination of elements in a database may also
have a distinct sensitivity
The combination may be more or less sensitive than
any of its individual elements
An access control policy must dictate which users may
have access to what data (each data element is
marked to show its access limitation)
A means is needed to guarantee that the value has not
been changed by an unauthorized person
In other words, there is a need for both secrecy and
integrity
Integrity
Recall that the *-property for access control
states:
A process that reads high level data is not allowed
to write down to a lower level data element
The * - property poses a problem when
applied to databases since the DBMS must
be able to read and write all records in the
database in order to:
Perform backup functions.
Scan the database for queries.
Reorganize the database to suit user's needs.
Update all records of the database
There are two choices:
The process cleared at a high level cannot write to
a lower level in accordance with the * - property.
The process must be a trusted process
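The two choices can be sketched as a single write check. The level names and their linear ordering are assumptions for illustration; a real system may use a lattice rather than a simple ranking.

```python
# Assumed linear ordering of sensitivity levels (names are illustrative).
LEVELS = {"unclassified": 0, "confidential": 1, "secret": 2, "top secret": 3}


def may_write(process_level, element_level, trusted=False):
    """Apply the *-property: no write-down unless the process is trusted."""
    if trusted:
        return True
    return LEVELS[element_level] >= LEVELS[process_level]


backup_blocked = may_write("top secret", "confidential")
backup_trusted = may_write("top secret", "confidential", trusted=True)
write_up = may_write("confidential", "secret")
```

An untrusted DBMS process cleared at a high level cannot update low-level records, which is why the DBMS usually has to run as a trusted process.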
Secrecy
Two different users operating at
two different levels of security
might get two different answers to
the same query.
As seen earlier in order to preserve
secrecy we may have to sacrifice
accuracy.
The multilevel nature can result in
unknowingly creating redundancies.
Proposals for Multilevel Security
Partitioning
Encryption
Integrity Lock
Sensitivity Lock
Trusted Front-End
Commutative Filters
View
Layered Implementation
Partitioning
The database is divided into separate
databases, each at its own security level
(sometimes known as atomization of the
database).
This destroys basic advantages of
databases:
Elimination of redundancy
Improved accuracy
Does not address the problem of a high-level
user who needs to access some low-level data
to be combined with high-level data
Encryption
If sensitive data is encrypted, a user who
accidentally receives sensitive data cannot
interpret the data.
Not foolproof since the user can:
Mount a chosen-plaintext attack.
Substitute the encrypted form of his/her own
data
Solutions:
Use a different encryption key for each record and
for each field
Cryptographically link fields of a record by using
a block chaining method (Cipher Block Chaining
(CBC), Cipher Feedback (CFB) etc.).
Integrity Lock
A way to provide both integrity
and limited access for a database.
Method nicknamed 'spray paint'
since each element is painted with
a color which denotes its
sensitivity.
The color is maintained with the
element and not in an external
table.
Each data item consists of three
elements
Data
stored in plaintext for efficiency
Classification
unforgeable - so that a malicious
subject cannot create a new sensitivity
label for an element.
unique - so that a malicious subject
cannot copy a sensitive level from
another element.
concealed - so that a malicious
subject cannot even determine the
sensitivity level of an arbitrary object
Cryptographic Checksum
To be unique, it contains information
about:
the record
the field
the data element
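A minimal sketch of an integrity lock, assuming a keyed hash (HMAC-SHA256) as the cryptographic checksum. The key, the field names, and the helper functions are hypothetical. The sketch covers the unforgeable and unique properties only: the classification is left readable here, so a full implementation would also need to conceal it.

```python
import hashlib
import hmac

# Placeholder key; assumed to be known only to the trusted access controller.
SECRET_KEY = b"demo-key-for-illustration-only"


def checksum(record_id, field, value, classification):
    """Keyed checksum binding value and label to one record and field."""
    message = "|".join([str(record_id), field, str(value), classification])
    return hmac.new(SECRET_KEY, message.encode(), hashlib.sha256).hexdigest()


def lock(record_id, field, value, classification):
    """Build the three-part item: data, classification, checksum."""
    return {
        "value": value,
        "classification": classification,
        "checksum": checksum(record_id, field, value, classification),
    }


def verify(record_id, field, item):
    """Recompute the checksum and compare it with the stored one."""
    expected = checksum(record_id, field, item["value"], item["classification"])
    return hmac.compare_digest(expected, item["checksum"])


item = lock(7, "salary", 52000, "secret")
genuine = verify(7, "salary", item)

relabelled = dict(item, classification="public")   # forged sensitivity label
forged = verify(7, "salary", relabelled)

copied = verify(8, "salary", item)                 # lock copied to another record
```

Including the record and field in the checksum is what stops a lock from being copied between elements.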
A sensitivity lock is a combination of:
A unique identifier (record number)
The security level
Must not be able to identify two elements
having identical security levels just by
looking at the security portion of the
integrity lock.
As a result of the encryption, the lock's
contents, especially the security level, are
concealed.
Associated with one specific record
Protects the secrecy level of that record
Intention was to be able to use any (untrusted)
DBMS with a trusted procedure to handle access
control.
Trusted access controller:
Efficiency Drawbacks:
Extra storage required for sensitivity
label.
Processing time increased.
During query processing.
Label must be recomputed for each
value modified.
If the database itself is adequately
protected it may be possible for the
data values to be left in plaintext to
reduce processing time
A trusted front end (also known as
a guard) functions much like the
monitor we discussed while we
were studying operating system
security methods.
Many DBMSs built and put into use
without consideration for
multilevel security.
Interaction Sequence:
User identifies self to front-end; front-end
authenticates user's identity.
User issues a query to front-end.
Front-end verifies user's authorization to data.
Front-end issues query to database manager.
Database manager performs I/O access.
Database manager returns result of query to
front-end.
Front-end verifies validity of data via checksum
and checks classification of data against security
level of user.
Front-end transmits data to untrusted front-end
for formatting.
Untrusted front-end transmits data to user.
Commutative Filters
Interfaces with both the user and
database manager.
The filter reformats the query such
that:
The database manager does as much of the
work as possible, screening out many
unacceptable records.
The filter provides a second screening to select
only data to which the user has access.
Can be used at the record, attribute, or element
level
At the record level, the filter requests desired
data plus cryptographic checksum information;
it then verifies the accuracy and accessibility of
data to be passed to the user
At the attribute level, the filter checks whether
all the attributes in the user's query are
accessible to the user and, if so, passes the
query to the database manager. On return, it
deletes all fields to which the user has no
access rights
At the element level, the system requests
desired data plus cryptographic checksum
information. When this is returned it checks the
classification level of every element of every
record retrieved against the user's level
View
A subset of a database, containing exactly the
information that a user is entitled to access.
Can represent a single user's subset database,
so that all of a user's queries access only that
data.
Consist of relations on attributes.
The data presented to a user is obtained by
filtering the contents of the original database
Attribute withheld unless user is authorized to
access at least one element.
Record withheld unless user is authorized to
access at least one element.
Remaining elements that the user is not authorized
to access are replaced by UNDEFINED
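The filtering rules above can be sketched as follows. The table contents, the two-level ordering, and the function `build_view` are assumptions made up for the example.

```python
UNDEFINED = "UNDEFINED"
LEVELS = {"public": 0, "secret": 1}  # assumed two-level ordering

# Each element is a (value, classification) pair; all data is made up.
TABLE = [
    {"name": ("Adams", "public"), "salary": (52000, "secret"), "dept": ("R&D", "public")},
    {"name": ("Bailey", "secret"), "salary": (61000, "secret"), "dept": ("Ops", "secret")},
    {"name": ("Chin", "public"), "salary": (48000, "public"), "dept": ("Legal", "secret")},
]
ATTRIBUTES = ["name", "salary", "dept"]


def build_view(table, attributes, clearance):
    """Filter the original table down to what this clearance may see."""

    def allowed(element):
        return LEVELS[element[1]] <= LEVELS[clearance]

    # Withhold a record unless at least one of its elements is accessible.
    rows = [row for row in table if any(allowed(row[a]) for a in attributes)]
    # Withhold an attribute unless at least one of its elements is accessible.
    kept = [a for a in attributes if any(allowed(row[a]) for row in rows)]
    # Replace the remaining inaccessible elements by UNDEFINED.
    return [
        {a: row[a][0] if allowed(row[a]) else UNDEFINED for a in kept}
        for row in rows
    ]


public_view = build_view(TABLE, ATTRIBUTES, "public")
secret_view = build_view(TABLE, ATTRIBUTES, "secret")
```

For the public clearance, the Bailey record disappears entirely, while the other two records keep all three attributes with the inaccessible elements shown as UNDEFINED.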
Integrated with a trusted operating system to
form trusted database manager base
First level - Reference Monitor
Performs file interaction enforcing Bell-
LaPadula access controls.
Performs user authentication.
Second level
Performs basic indexing and computation
functions
Third Level
Translates views into the base relations of
the database
Data Protection Act
“The Act gives rights to individuals about
whom information is recorded on computer.”
“They may find out information about
themselves, challenge it if appropriate and
claim compensation in certain
circumstances.”
Users of personal data must:
Be open about the use made of data
Follow sound and proper practices
Personal data shall:
be obtained and processed fairly and lawfully
be held only for the lawful purposes described in
the register
be used/disclosed only for those purposes
be adequate, relevant and not excessive in relation
to the purpose
be accurate and kept up to date
be held no longer than necessary
be secure
All personal data must be registered.
Content
Data user’s name and address
The personal data that is held
The purposes for which the data is held
The sources for the data
The people to whom data may be supplied
Overseas countries to which the data may be
transferred
Failure to comply is a criminal offense