Flash Cards

Database Management System (DBMS) is software that facilitates creation and maintenance of databases. It allows users to interact with databases through queries, which access data, and transactions, which may read and update values. The DBMS catalog stores metadata about the database structure. Program-data independence separates data access programs from physical storage details. Data abstraction presents users a conceptual view of data rather than storage details. The relational model represents data as mathematical relations with rows and columns. Database design aims to eliminate anomalies like insertion, deletion, and update anomalies. Normalization divides relations to eliminate functional dependencies that are not related to primary keys. Structured Query Language (SQL) is used to define, manipulate and control access to data in a

Uploaded by

Patek lyu


Mini-world

some part of the real world about which data is stored in a database

Database Management System (DBMS)

a software package/system to facilitate the creation and maintenance of a computerized
database

Queries vs. Transactions

Applications interact with a database by generating:

Queries: access different parts of data and formulate the result of the request

Transactions: may read some data and 'update' certain values or generate new data and store
that in the database

DBMS catalog
stores the description of a particular database (e.g. data structures, types and constraints)

Description is called meta-data

Program-Data independence
Structure of data files is stored in DBMS catalog separately from access programs

Allows changing data structures and storage organization without having to change the DBMS
access programs

Data abstraction
a data model is used to hide storage details and present the users with a conceptual view of the
database

Programs refer to the data model constructs rather than data storage details

Actors on the scene

database users who actually use and control the database content, and those who design,
develop and maintain database applications

Workers behind the scene

Those who design and develop the DBMS software and related tools, and the computer systems
operators

Big Data
High-volume, high-velocity, and/or high-variety information assets that require innovative forms
of information processing for enhanced insight and decision making.

Data Model
A set of concepts to describe the structure of a database, the operations for manipulating these
structures, and certain constraints that the database should obey.

Categories of Data Models

Conceptual (high-level, semantic)

Logical (implementation, representational)

Physical (low-level, internal)

Self-Describing

Three-Schema Architecture
External Schema (end users)
Conceptual Schema (conceptual and logical data models)
Internal Schema (physical data model)

Supports program-data independence and multiple views of the data

Logical Data Independence

The capacity to change the conceptual schema without having to change the external schemas
and their associated application programs.

Physical Data Independence

The capacity to change the internal schema without having to change the conceptual schema.

Types of database constraints

Inherent or Implicit
Schema-based or Explicit

Application Based or Semantic

Inherent/Implicit constraints
based on the data model itself (e.g. relational data model does not allow a list as a value for any
attribute)

Schema-based/Explicit constraints
expressed in the schema by using the facilities provided by the model (ex. max cardinality ratio)

Application based/semantic constraints

beyond the expressive power of the model and must be specified and enforced by application
programs

Design guidelines for relational databases

1. informally, each tuple in a relation should represent one entity or relationship instance

2. design a schema that does not suffer from insertion, deletion and update anomalies

3. relations should be designed such that their tuples will have as few NULL values as possible

4. the relation should be designed to satisfy the lossless join condition

Lossless Join Condition

No spurious tuples generated by doing a natural join of any relations

Functional Dependencies
A FD holds if whenever 2 tuples have the same value for X they MUST have the same value for Y
i.e. If t1[X]=t2[X] then t1[Y]=t2[Y]

Given only an instance of a relation, one can conclude that an FD may hold or that it definitely
does not hold; an instance cannot prove that the FD holds in all cases
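A minimal sketch of the test t1[X]=t2[X] => t1[Y]=t2[Y] on one relation instance. The function name `fd_holds` and the sample rows are hypothetical; a True result only means the instance does not violate the FD.

```python
def fd_holds(rows, lhs, rhs):
    """Return True if the FD lhs -> rhs is not violated by this instance."""
    seen = {}  # X-value -> the Y-value first seen with it
    for row in rows:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        if seen.setdefault(x, y) != y:
            return False  # same X, different Y: the FD cannot hold
    return True

# Hypothetical sample instance
employees = [
    {"ssn": "111", "dno": 5, "dname": "Research"},
    {"ssn": "222", "dno": 5, "dname": "Research"},
    {"ssn": "333", "dno": 4, "dname": "Admin"},
]

dno_determines_dname = fd_holds(employees, ["dno"], ["dname"])
dname_determines_ssn = fd_holds(employees, ["dname"], ["ssn"])
```

Here `dno -> dname` may hold, while `dname -> ssn` definitely does not, because two Research rows carry different ssn values.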

First Normal Form (1NF)

Disallows: composite attributes, multivalued attributes and nested relations (i.e. attribute values
cannot be lists)

considered to be part of the definition of a relation

1NF normalization
Move the attributes violating 1NF to a new relation and associate the relations via keys (FK, PK)

Second Normal Form (2NF)

Every non-prime attribute A in R is fully functionally dependent on every key of R

i.e. a FD Y -> Z where removal of any attribute from Y means the FD does not hold anymore

Prime attribute
An attribute that is a member of a (candidate) key K

2NF Normalization
Move the attribute involved in a 2NF violation to a new relation, maintain in the original relation
the LHS attributes and associate them to the new relation via keys

Third Normal Form (3NF)

2NF + no non-prime attribute A in R is transitively dependent on the primary key

normalization process same as 2NF

Transitive functional dependency

a FD X -> Z that can be derived from 2 FDs X -> Y and Y -> Z
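One common way to see which FDs can be derived is to compute the attribute closure of X. This is a sketch of that standard technique, which the card does not name; `closure` and the sample FDs are assumptions for illustration.

```python
def closure(attrs, fds):
    """Attributes determined by attrs under fds, a list of (lhs, rhs) sets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # If we already know all of lhs, we also know rhs.
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result

# Hypothetical FDs: ssn -> dno and dno -> dname
fds = [({"ssn"}, {"dno"}), ({"dno"}, {"dname"})]
ssn_closure = closure({"ssn"}, fds)
```

Because `dname` ends up in the closure of `ssn`, the FD ssn -> dname is derived from the other two, so it is a transitive dependency.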

Creating a Schema (SQL)

CREATE SCHEMA <schema_name> AUTHORIZATION <value>;

Creating a data type (SQL)

CREATE DOMAIN <type_name> AS <data type>;

ex. CREATE DOMAIN SSN_TYPE AS CHAR(9);

Creating a relation (SQL)
CREATE TABLE <[schema_name.]table_name>
(<attribute_name> <data_type> <attribute_constraints>, ..., <table_constraints>, ...);

Primary key constraint (SQL)

on attribute: <attribute_name> <data_type> PRIMARY KEY

on table: CONSTRAINT <constraint_name> PRIMARY KEY (<attribute_name>)

Secondary key constraint (SQL)

on attribute: <attribute_name> <data_type> UNIQUE

on table: CONSTRAINT <constraint_name> UNIQUE(<attribute_name>)

Foreign key constraint (SQL)

on attribute: <attribute_name> REFERENCES <table_name>(<referenced_attribute_name>)

on table: CONSTRAINT <constraint_name> FOREIGN KEY(<attribute_name>) REFERENCES
<table_name>(<referenced_attribute_name>) ON DELETE <delete_instructions>

Default operations (ON DELETE/ ON UPDATE)

SET NULL

CASCADE (suitable for relationship relations)

SET DEFAULT
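A small sketch of SET NULL and CASCADE, run through Python's `sqlite3` module. SQLite is an assumption here (the cards do not name a DBMS), and it enforces foreign keys only after `PRAGMA foreign_keys = ON`. Table and column names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific switch
conn.executescript("""
    CREATE TABLE department (dnumber INTEGER PRIMARY KEY);
    CREATE TABLE employee (
        ssn TEXT PRIMARY KEY,
        dno INTEGER REFERENCES department(dnumber) ON DELETE SET NULL
    );
    CREATE TABLE works_on (
        essn TEXT REFERENCES employee(ssn) ON DELETE CASCADE,
        pno  INTEGER
    );
    INSERT INTO department VALUES (5);
    INSERT INTO employee VALUES ('123456789', 5);
    INSERT INTO works_on VALUES ('123456789', 1);
""")

# Deleting the department sets the referencing dno to NULL.
conn.execute("DELETE FROM department WHERE dnumber = 5")
dno_after = conn.execute("SELECT dno FROM employee").fetchone()[0]

# Deleting the employee removes the dependent works_on rows.
conn.execute("DELETE FROM employee WHERE ssn = '123456789'")
works_on_rows = conn.execute("SELECT COUNT(*) FROM works_on").fetchone()[0]
```

CASCADE on `works_on` matches the note that it suits relationship relations: a relationship row has no meaning once the employee it refers to is gone.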

Additional constraints using check (SQL)

on attribute: <attribute_name> <data_type> CHECK (<constraint>)

on table: CONSTRAINT <constraint_name> CHECK (<constraint>)

ex. CHECK (Salary >= 0 AND Salary <= 100000)
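The sketch below tries the PRIMARY KEY, UNIQUE and CHECK constraints from the preceding cards against SQLite via Python's `sqlite3` module. The DBMS choice, the `try_insert` helper and the sample rows are assumptions. SQLite does not support CREATE DOMAIN, so a plain CHAR(9) column is used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        ssn    CHAR(9) PRIMARY KEY,
        name   VARCHAR(30) UNIQUE,
        salary INTEGER CHECK (salary >= 0 AND salary <= 100000)
    )
""")
conn.execute("INSERT INTO employee VALUES ('123456789', 'Smith', 30000)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

check_ok = try_insert(("987654321", "Wong", 200000))    # salary outside CHECK range
pk_ok = try_insert(("123456789", "Zelaya", 25000))      # duplicate primary key
unique_ok = try_insert(("333445555", "Smith", 25000))   # duplicate UNIQUE name
valid_ok = try_insert(("999887777", "Zelaya", 25000))   # satisfies every constraint
```

Only the last row is stored; the other three are rejected by the DBMS itself, with no application code involved.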

The DROP command

used to drop named schema elements such as tables, domains or constraints
ex. DROP TABLE company.employee CASCADE;

DROP SCHEMA company CASCADE;

The ALTER TABLE command

actions include: adding/dropping a column (attribute), changing a column definition,
adding/dropping constraints

ex. ALTER TABLE company.employee ADD COLUMN job VARCHAR (12);

ALTER TABLE company.employee DROP CONSTRAINT fk_employee_supervisor;

ALTER TABLE department ALTER COLUMN mgr_ssn SET DEFAULT '123456789'; (can also drop
default or set not null)

Commands for modifying the database

INSERT (for inserting tuples into a relation)

UPDATE (for updating tuples that satisfy the condition)

DELETE (removes tuples that satisfy a condition)

The INSERT command

INSERT INTO <table_name> (<attribute1_name>, ..., <attributen_name>) VALUES
(<attribute1_value>, ..., <attributen_value>)

if attribute is not listed, will be set to default

A nested SELECT query can supply the values to insert

Specifying Sequences
CREATE SEQUENCE <sequence_name> START <start_value> INCREMENT <increment_value>;

to set value as next number in sequence: nextval(<sequence_name>)

SELECT statement
SELECT <attribute and function list> FROM <table> [WHERE <condition>] [GROUP BY <grouping
attributes>] [HAVING <group condition>] [ORDER BY <attribute list>];

Comparison operators
=, >, >=, <, <=, <>

BETWEEN operator
BETWEEN <value1> AND <value2>

ex. WHERE salary BETWEEN 30000 AND 40000

LIKE comparison operator

% replaces an arbitrary number of characters, _ replaces a single character

ex. WHERE Address LIKE '%Houston%'

WHERE SSN LIKE '_18901'
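A quick check of BETWEEN and the two LIKE wildcards, using SQLite through Python's `sqlite3` module (an assumed DBMS). The rows, the shortened ssn values and the `names` helper are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, address TEXT, ssn TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("Smith", "731 Fondren, Houston, TX", "218901", 30000),
    ("Wong", "638 Voss, Houston, TX", "3318901", 40000),
    ("Zelaya", "3321 Castle, Spring, TX", "918901", 25000),
])

def names(where):
    """Names of the employees matching a WHERE condition, sorted."""
    sql = "SELECT name FROM employee WHERE " + where + " ORDER BY name"
    return [row[0] for row in conn.execute(sql)]

in_houston = names("address LIKE '%Houston%'")        # % matches any run of characters
one_char_prefix = names("ssn LIKE '_18901'")          # _ matches exactly one character
mid_salary = names("salary BETWEEN 30000 AND 40000")  # both bounds are inclusive
```

Wong's ssn has two characters before 18901, so the `_` pattern skips it, while BETWEEN keeps both the 30000 and the 40000 row.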

Aliasing
Used to shorten query when referring to same table more than once

Ex. SELECT e.name FROM Employee AS e ...;

Can also rename attributes Ex. SELECT ... FROM Employee AS e(fn, mi, ln)...;

Join condition
<table1> JOIN <table2> ON <condition>

can nest join conditions

ex. SELECT e.* FROM employee AS e JOIN department AS d ON e.dno = d.dnumber

Aggregate functions
Can be used in SELECT or HAVING clause

COUNT, SUM, MAX, MIN, AVG, STDDEV_POP

nulls are discarded
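A small illustration of "nulls are discarded", using SQLite through Python's `sqlite3` module (an assumed DBMS; STDDEV_POP is left out because SQLite does not provide it). The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (?, ?)", [
    ("Smith", 100),
    ("Wong", None),   # stored as NULL
    ("Zelaya", 300),
])

count_all, count_salary, sum_salary, avg_salary = conn.execute(
    "SELECT COUNT(*), COUNT(salary), SUM(salary), AVG(salary) FROM employee"
).fetchone()
```

COUNT(*) counts rows, so it still sees three; COUNT(salary), SUM and AVG skip the NULL, so the average is taken over two values.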

ORDER BY clause
Keywords: DESC, ASC

Ex. SELECT ... ORDER BY d.dname DESC, e.lname ASC;

Eliminating duplicate tuples in query results

Use the keyword DISTINCT in the SELECT clause i.e. SELECT DISTINCT e.ssn ...;

Comparison operators for nested queries

IN (R)
value theta ANY (R)
value theta ALL (R)
EXISTS
NOT EXISTS

Comparisons involving NULL

operations involving null return NULL (ex. NULL + 1 = NULL)

comparisons involving NULL return UNKNOWN
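These rules can be checked directly. The sketch uses SQLite through Python's `sqlite3` module (an assumed DBMS), where both NULL and UNKNOWN come back to Python as `None`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Arithmetic and comparison with NULL both yield NULL/UNKNOWN (None in Python);
# IS NULL is the test that returns a real truth value.
null_plus_one, null_eq_null, null_is_null = conn.execute(
    "SELECT NULL + 1, NULL = NULL, NULL IS NULL"
).fetchone()

# A WHERE condition that evaluates to UNKNOWN filters the row out.
rows_kept = conn.execute("SELECT 1 WHERE NULL = NULL").fetchall()
```

This is why conditions on possibly missing values are written with IS NULL or IS NOT NULL rather than with = or <>.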

the CASE statement

allows conditional instructions
CASE
WHEN 'cond1' THEN 'result1'
...
[ELSE 'resultN']
END ...;

ex. SELECT fname, CASE (sex) WHEN 'M' THEN 'Male' ELSE 'Female' END FROM employee;

UPDATE employee SET salary = (CASE WHEN dno = 5 THEN salary * 1.15 ELSE salary * 1.3 END);
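A runnable variant of both CASE forms, using SQLite through Python's `sqlite3` module (an assumed DBMS). It assumes the raise in the UPDATE is a multiplication by 1.15 or 1.3; the rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (fname TEXT, sex TEXT, dno INTEGER, salary REAL)")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("John", "M", 5, 30000),
    ("Alicia", "F", 4, 25000),
])

# Simple CASE: compare one expression against listed values.
labels = conn.execute(
    "SELECT fname, CASE sex WHEN 'M' THEN 'Male' ELSE 'Female' END "
    "FROM employee ORDER BY fname"
).fetchall()

# Searched CASE: each WHEN carries its own condition.
conn.execute(
    "UPDATE employee SET salary = "
    "CASE WHEN dno = 5 THEN salary * 1.15 ELSE salary * 1.3 END"
)
salaries = dict(conn.execute("SELECT fname, salary FROM employee"))
```

The UPDATE has no WHERE clause, so every row is updated and CASE alone decides which factor applies.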

The DELETE clause

DELETE FROM <table> WHERE <condition>;

to delete all: DELETE FROM <table>;

the UPDATE clause
UPDATE <table> SET <attribute> = <value or function> WHERE <condition>

Views in SQL
single table derived from other tables

CREATE VIEW <view_name>(<attribute_list>) AS SELECT ...;

DROP VIEW disposes of a view

Once defined, can be referenced as a table in queries

Computed Views
the DBMS stores the view definition and executes the view query every time the view is used
(always up to date; updated automatically when parent tables are updated)

zero maintenance but does not increase performance
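A brief sketch of the "always up to date" behaviour, using SQLite through Python's `sqlite3` module (an assumed DBMS). The view name `dept5_staff` and the rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dno INTEGER)")
conn.execute("INSERT INTO employee VALUES ('Smith', 5)")

# Only the definition is stored; the query runs each time the view is used.
conn.execute("CREATE VIEW dept5_staff AS SELECT name FROM employee WHERE dno = 5")
before = conn.execute("SELECT COUNT(*) FROM dept5_staff").fetchone()[0]

# A change to the base table shows up in the view with no maintenance step.
conn.execute("INSERT INTO employee VALUES ('Wong', 5)")
after = conn.execute("SELECT COUNT(*) FROM dept5_staff").fetchone()[0]
```

The second count reflects the new row immediately, at the price of re-running the view query on every use.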

Materialized view
CREATE MATERIALIZED VIEW...

the DBMS stores the view definition, executes the query and stores the result as system
controlled table

increases performance but requires system to update view to reflect updates in base tables

Maintenance of materialized views

Immediate update: updates view as soon as base tables are changed

Lazy Update: updates view when needed by a view query

Periodic update

Derived Tables
Used for bulk-loading of several tuples into a table that satisfy a condition

CREATE TABLE <table_name> LIKE <base_table_name> (SELECT ...) WITH DATA;

does not maintain association with base tables

Assertions
CREATE ASSERTION <assertion name> CHECK (<condition, can include SELECT statement>);

Triggers
CREATE TRIGGER <name> {BEFORE | AFTER} [event [OR ...]] ON <table_name> [FOR [EACH] {ROW
| STATEMENT}] [WHEN (condition)] EXECUTE (function);

Main programming approaches

Programming into the DBMS Server

Using embedded SQL

Using a library of database functions

Using a persistence framework

Impedance Mismatch
Differences between database model and programming language model

Server-side programming
Allows implementing complex operations using a programming language supported by the DBMS,
operations executed in the DBMS server, can store procedures

Libraries of database functions

use libraries or APIs provided for the host language to access database

JDBC (Java Database Connectivity) - Driver

Allows a JAVA program to connect to several different databases

JDBC (Java Database Connectivity) - connection object

encapsulates a database connection

JDBC (Java Database Connectivity) - statement object

used to interact with the database through an opened connection

JDBC (Java Database Connectivity) - ResultSet object

holds results of query

Database programming Approach (Pros/Cons)

P: does not suffer from impedance mismatch problem

C: programmers must learn a new language

Embedded SQL approach (P/C)

P: query text checked for syntax and validated against database schema at compile time

C: for complex applications where queries have to be generated at runtime, function call
approach more suitable

Library of function calls approach (P/C)

P: more flexibility, more complex programming

C: no checking of syntax done at compile time

Persistence frameworks approach (P/C)

P: encapsulates database operations and implements ORM functionalities

C: can be limited or deliver poor performance for data-intensive or complex operations

Double Buffering
Used to read continuous stream of blocks.

Use of 2 buffers, A and B, for reading from disk. While one buffer is being filled the other is
consumed

Buffer management information

Pin count (to pin a frame in the buffer pool)

Dirty bit (to indicate a modified block)

Buffer replacement strategies
To free frames in the buffer pool. Strategies:
Least recently used (LRU)
First-in-first-out (FIFO)
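A toy buffer pool that combines LRU replacement with the pin count and dirty bit from the previous card. The class and method names are hypothetical, and a list stands in for writes back to disk.

```python
from collections import OrderedDict

class BufferPool:
    """Toy buffer pool with LRU replacement, pin counts and dirty bits."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.frames = OrderedDict()  # block_id -> {"pin": int, "dirty": bool}
        self.flushed = []            # dirty blocks written back on eviction

    def fetch(self, block_id):
        """Pin a block, loading it (and evicting an unpinned LRU frame) if needed."""
        if block_id in self.frames:
            self.frames.move_to_end(block_id)  # now the most recently used
        else:
            if len(self.frames) >= self.capacity:
                self._evict()
            self.frames[block_id] = {"pin": 0, "dirty": False}
        self.frames[block_id]["pin"] += 1

    def release(self, block_id, modified=False):
        """Unpin a block and remember whether the caller modified it."""
        frame = self.frames[block_id]
        frame["pin"] -= 1
        frame["dirty"] = frame["dirty"] or modified

    def _evict(self):
        for victim, frame in self.frames.items():  # least recently used first
            if frame["pin"] == 0:
                if frame["dirty"]:
                    self.flushed.append(victim)    # write back before reuse
                del self.frames[victim]
                return
        raise RuntimeError("all frames are pinned")

pool = BufferPool(2)
pool.fetch("A"); pool.release("A", modified=True)
pool.fetch("B"); pool.release("B")
pool.fetch("A"); pool.release("A")  # A becomes the most recently used block
pool.fetch("C"); pool.release("C")  # evicts B, which is clean
pool.fetch("D")                     # evicts A, which is dirty and gets flushed
```

A FIFO policy would differ only in `fetch`: it would not call `move_to_end`, so the block loaded first is evicted first regardless of later use.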

Record (placing records in files)

collection of related data values or items. Values correspond to record field (ex. row in a table)

Allocated to disk blocks

Binary large objects (BLOBs) (placing records in files)

Unstructured objects ex. image file

Allocated to disk blocks

Unspanned records
records that fit in a block; not allowed to cross block boundaries

Spanned records
records larger than a single block; pointer at end of first block points to block containing
remainder of record

Retrieval operations on files

no change to file data; blocks are loaded in the buffer pool

ex. Open, scan, find, read, FindNext, Close

Update operations on files

file change by insertion, deletion or modification.

Updates are made in the buffer pool, setting the dirty bit to true, and are stored on disk after the
transaction commits

ex. Delete, Insert

Heap (or pile) file

(file organization)
records placed in the file in order of insertion

inserting a new record is very efficient, searching for a record requires linear search

Ordered (sequential) file

(file organization)
records sorted by ordering field

reading records in order of ordering key value is extremely efficient, searching on the ordering
field can use binary search, updates require reorganizing the file

Indexing
Used to speed up record retrieval. Index structures provide secondary access paths.

Multiple indexes can be constructed, may be unique or non-unique

Clustering Index
specified on the ordering key field of ordered file records

One clustering index per data file

Creating a Clustering Index

CREATE [UNIQUE] CLUSTERING INDEX <index_name> ON table(attribute list);

Sorts the data file and maintains it ordered

Secondary index
can be specified on any nonordering field. A data file can have several secondary indexes

CREATE [UNIQUE] INDEX <index_name> ON table(attribute list);

The B+ -Tree Dynamic Multilevel Index

Disk-based search tree. Nodes have the size of a block.

Given 2 search keys K(i-1) < K(i), the elements X stored in the corresponding subtree satisfy K(i-1) < X <= K(i)

Tree pointers are pointers to blocks: data file + block

Record pointers (rids) have the structure: data file + block + record

B+ Tree Index - Balanced Tree

reorganized at each insert or delete using split and unsplit of nodes. Nodes are also at least half
full, except the root node

Bottom-up construction

B+ tree Index - Leaf Nodes

store search keys and data pointers (rids). For a non-unique search field, the pointer points to list
of pointers to the data file records. Store pointers to the next and to the previous leaf nodes

B+ Tree Index - Internal Nodes

store search keys and tree pointers. Some search field values from the leaf nodes repeated to
guide search

B+ Tree Insertion
start a search based on the key of the entry being inserted from the root until reach a leaf node.
If there is room in the leaf node for the new entry, add the entry and end the insert. If the leaf
node is full, execute a split

B+ Tree Insertion - Split

Create a new node
Divide the entries between the current node and the new node
Promote the median element to the parent node
The split can be propagated up to the root node, increasing the height of the tree
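A sketch of the leaf-level split only, under stated assumptions: a leaf is a sorted Python list of keys, `capacity` is the maximum number of entries, and the median stays in the left node while a copy is promoted, which matches the rule K(i-1) < X <= K(i). The function name is hypothetical.

```python
import bisect

def insert_into_leaf(leaf, key, capacity):
    """Insert key into a sorted leaf; split if it overflows.

    Returns (left, right, separator). right and separator are None
    when no split was needed.
    """
    bisect.insort(leaf, key)
    if len(leaf) <= capacity:
        return leaf, None, None
    mid = (len(leaf) + 1) // 2          # left node keeps the median
    left, right = leaf[:mid], leaf[mid:]
    return left, right, left[-1]        # median is promoted to the parent

no_split = insert_into_leaf([5, 8], 12, capacity=4)
with_split = insert_into_leaf([5, 8, 15, 20], 12, capacity=4)
```

In the second call the overfull leaf [5, 8, 12, 15, 20] becomes [5, 8, 12] and [15, 20], and 12 goes to the parent. If the parent is full as well, the same step repeats one level up.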

B+ Tree Search
if the search is an interval, set the search key as the lower bound of the interval (for a point query,
lower and upper bounds are equal)

Start a search based on the search key from the root until a leaf node. Retrieve the record
pointers of the entries from the search key (lower bound) to the upper bound, traversing the
leaves via the pointer to the next leaf
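The search steps can be sketched on a small hand-built tree. The classes `Leaf` and `Internal` and the sample keys are assumptions; the keys stand in for the record pointers that a real leaf would return.

```python
import bisect

class Leaf:
    def __init__(self, keys):
        self.keys = keys   # sorted search keys (rids omitted in this sketch)
        self.next = None   # pointer to the next leaf

class Internal:
    def __init__(self, keys, children):
        self.keys = keys          # separator keys
        self.children = children  # len(children) == len(keys) + 1

def find_leaf(node, key):
    """Descend from the root to the leaf that may contain key."""
    while isinstance(node, Internal):
        # Child i holds the keys X with keys[i-1] < X <= keys[i].
        node = node.children[bisect.bisect_left(node.keys, key)]
    return node

def range_search(root, low, high):
    """Collect the keys in [low, high], walking the leaf chain."""
    leaf = find_leaf(root, low)
    result = []
    while leaf is not None:
        for k in leaf.keys:
            if k > high:
                return result
            if k >= low:
                result.append(k)
        leaf = leaf.next
    return result

leaves = [Leaf([5, 8, 12]), Leaf([15, 20]), Leaf([25, 30])]
for left, right in zip(leaves, leaves[1:]):
    left.next = right
root = Internal([12, 20], leaves)

interval = range_search(root, 8, 25)
point = range_search(root, 15, 15)
```

Only one root-to-leaf descent is needed; the rest of the interval is read by following the next-leaf pointers.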

B+ Tree Deletion
Start a search based on the key to be deleted from the root until a leaf node. Delete from the leaf
node the entry that matches the search key.

If the leaf node becomes less than half full try to borrow an entry from a sibling leaf node
otherwise execute an unsplit

B+ Tree Deletion- Unsplit

Merge the entries of 2 sibling nodes in one of these nodes
Delete the other node
Propagate the operation to remove from the parent node the entry to the deleted node
The unsplit can be propagated up to the root node, decreasing the height of the tree

Hash Indexes
Based on a hashing function. Uses hashing on a search key other than the one used for the
primary data file organization. Suitable for point queries, not range queries

Domain-Specific Indexes
Spatial: for spatial based queries (ex. queries on maps) ex. R-tree, kd-tree

Full-text search: for keyword-based search in text attributes ex. inverted indexes

Transaction
Describes a logical unit of database processing. An executing process that includes one or more
database access operations.

Characteristic operations: Reads, writes

OLTP (online transaction processing)

Large multi-user database systems supporting thousands of concurrent transactions per minute.
Require high availability and fast response time

Transaction processing model - basic operations

Granularity (size) of each data item is immaterial

Basic operations:
read_item(X) - reads a database item X into a program variable named X

write_item(X) - writes the value of program variable X into the database item named X

Read operation steps
Find the address of the disk block that contains item X
Copy that disk block into a buffer in main memory (if that disk block is not already in some main
memory buffer)
Copy item X from the buffer to the program variable named X

Write operation steps

Find the address of the disk block that contains item X
Copy that disk block into a buffer in main memory (if it is not already in a main memory buffer)
Copy item X from the program variable named X into its correct location in the buffer
Store the updated block from the buffer back to disk (either immediately or at some later point in
time)

Transaction boundaries
Begin_transaction and End_transaction

Application program may include specification of several transactions separated by Begin and End
transaction boundaries

Transaction end states

Commit: transaction successfully completes and its results are committed (made permanent)

Abort: transaction does not complete and none of its actions are reflected in the database

Transaction Notation
Ti specifies a unique transaction identifier

wi(Y) means transaction Ti writes out the value for data item Y

ri(Y) means transaction Ti reads the value for data item Y

ci means transaction Ti committed

ai means transaction Ti aborted

Interleaved Processing (modes of concurrency)

concurrent execution of processes is interleaved on a single CPU

Parallel Processing (modes of concurrency)
Processes are concurrently executed on multiple CPUs

Schedule
Sequence of interleaved operations from several transactions

ACID Properties
Atomicity, Consistency, Isolation, Durability

Atomicity
A transaction is an atomic unit of processing; it is either performed in its entirety or not
performed at all

Ensured by the recovery system

Consistency
A correct execution of the transaction must take the database from one consistent state to
another

Responsibility of the database constraint system

Isolation
Even though transactions are executing concurrently, they should appear to be executed in
isolation i.e. their final effect should be as if each transaction was executed in isolation from start
to finish

Responsibility of the concurrency control mechanism

Durability
Once a transaction is committed, its changes (writes) applied to the database must never be lost
because of subsequent failure

Ensured by the recovery system

Lost Update Problem

occurs when 2 transactions update the same data item, but both read the same original value
before update
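A minimal simulation of the interleaving r1(X) r2(X) w1(X) w2(X). The item name, starting value and amounts are made up.

```python
db = {"X": 100}

t1_local = db["X"]   # r1(X): T1 reads 100
t2_local = db["X"]   # r2(X): T2 reads the same original value
t1_local -= 10       # T1 computes 90
t2_local += 50       # T2 computes 150
db["X"] = t1_local   # w1(X)
db["X"] = t2_local   # w2(X) overwrites T1's update

interleaved_result = db["X"]
serial_result = 100 - 10 + 50  # what either serial order would produce
```

The interleaved run ends with 150 instead of 140: T1's subtraction is lost because T2 wrote a value computed from the stale read.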

Dirty Read Problem

Occurs when a transaction T2 reads a database item that was updated by another uncommitted
transaction T1 but then T1 aborts, invalidating the value that T2 read

Inconsistent Read Problem

A concurrency problem where the data is read in the middle of an update and is incorrect or not
current

Unrepeatable read problem

Occurs when one transaction updates a database item, which is read by another transaction both
before and after the update

Serial Schedule
A schedule S is serial if there is no interleaving of operations from several transactions (i.e. for
every transaction T, all the operations are executed consecutively)

Any serial schedule will produce a correct result

Problems with serial schedules

Long transactions force other transactions to wait, when a transaction is waiting for disk I/O or
any other event, system cannot switch to other transaction

Solution: allow some interleaving, without sacrificing correctness

Serializable schedule
A schedule equivalent to some serial schedule. It will leave the database in a consistent state.

Interleaving such that: transactions see data as if they were serially executed, transactions leave
DB state as if they were serially executed, efficiently achievable through concurrent execution

There are n! serial schedules for n transactions
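The n! count is simply the number of orderings of the transactions. A short check for three hypothetical transactions:

```python
from itertools import permutations
from math import factorial

transactions = ["T1", "T2", "T3"]
serial_schedules = list(permutations(transactions))  # every possible serial order
expected_count = factorial(len(transactions))
```

A serializable schedule only has to be equivalent to one of these orders, not to a particular one.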

Conflict Serializability
A schedule S with n transactions is conflict serializable if it is conflict equivalent to some serial
schedule of the same n transactions i.e. relative order of any 2 conflicting operations is the same
in both schedules

Conflicting Operations
Two operations are conflicting in a schedule if:
They belong to different transactions, they access the same item X and at least one of the
operations is a write_item(X)
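One common test for conflict serializability, not spelled out in the cards, builds a precedence graph with an edge Ti -> Tj for each pair of conflicting operations where Ti's comes first, then checks it for cycles. The sketch assumes a schedule given as (transaction, operation, item) triples; the function name is hypothetical.

```python
def conflict_serializable(schedule):
    """schedule: list of (txn, op, item) with op 'r' or 'w', in execution order."""
    edges = set()
    for i, (ti, op_i, x_i) in enumerate(schedule):
        for tj, op_j, x_j in schedule[i + 1:]:
            # Conflict: different transactions, same item, at least one write.
            if ti != tj and x_i == x_j and "w" in (op_i, op_j):
                edges.add((ti, tj))
    # The graph is acyclic iff we can keep removing nodes with no incoming edge.
    nodes = {t for t, _, _ in schedule}
    while nodes:
        sources = {n for n in nodes
                   if not any(b == n and a in nodes for a, b in edges)}
        if not sources:
            return False  # every remaining node lies on or behind a cycle
        nodes -= sources
    return True

# r1(X) r2(X) w1(X) w2(X): edges T1 -> T2 and T2 -> T1 form a cycle
interleaved = [(1, "r", "X"), (2, "r", "X"), (1, "w", "X"), (2, "w", "X")]
# r1(X) w1(X) r2(X) w2(X): a serial schedule
serial = [(1, "r", "X"), (1, "w", "X"), (2, "r", "X"), (2, "w", "X")]

interleaved_ok = conflict_serializable(interleaved)
serial_ok = conflict_serializable(serial)
```

The first schedule is the lost-update interleaving, and the cycle in its graph is what rules it out.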

Concurrency control protocols - two-phase locking protocols

lock data items to prevent concurrent access

Concurrency control protocols - timestamp ordering protocols

assign a unique identifier for each transaction, apply rules to control how transactions access
items according to the timestamps

Concurrency control protocols - multiversion techniques

keep several versions of an item, accept some read operations that would be rejected in other
techniques by reading an older version of the item while maintaining serializability

Concurrency control protocols - optimistic techniques

perform no checking while the transaction is executing, execute a validation phase to check
whether any of the transaction's updates violate serializability, and commit or abort transactions
based on result

Database Locks
variable associated with a data item describing status for operations that can be applied. One
lock for each item in the database

Locking operations
read_lock(X) - shared lock, required for reading
write_lock(X) - exclusive lock, required for writing
unlock(X)

Two-phase locking protocol (2PL) phases

All locking operations precede the first unlock operation in the transaction
Two phases:
1. expanding (growing): new locks can be acquired but none can be released. Lock conversion
upgrades (read_lock(X) -> write_lock(X)) must be done during this phase
2. Shrinking phase: existing locks can be released but none can be acquired. downgrades
(write_lock(X) -> read_lock(X)) must be done during this phase
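A sketch that checks whether one transaction's lock operations follow the two phases. It assumes operations given as (action, item) pairs and does not model lock conversions; the function name is hypothetical.

```python
def follows_2pl(ops):
    """True if no lock is acquired after the first unlock."""
    shrinking = False
    for action, _item in ops:
        if action == "unlock":
            shrinking = True      # the shrinking phase has started
        elif action in ("read_lock", "write_lock") and shrinking:
            return False          # a new lock after an unlock breaks 2PL
    return True

t_ok = [("read_lock", "Y"), ("write_lock", "X"), ("unlock", "Y"), ("unlock", "X")]
t_bad = [("read_lock", "Y"), ("unlock", "Y"), ("write_lock", "X"), ("unlock", "X")]

ok_result = follows_2pl(t_ok)
bad_result = follows_2pl(t_bad)
```

`t_bad` releases Y before locking X, so another transaction could change Y in between, which is the kind of interleaving 2PL is meant to exclude.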

2PL potential problems

Deadlock: each transaction is waiting for some item locked by some other transaction (solution:
detect deadlock and select one of the transactions involved to abort)

Starvation: occurs if a transaction cannot proceed for an indefinite period of time while other
transactions continue normally (solution: first come first serve queue)
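Deadlock detection is often done with a wait-for graph, where a cycle means deadlock. That technique is an assumption here, since the card only says "detect deadlock". The graph is a dict from each transaction to the set of transactions it waits for.

```python
def has_deadlock(waits_for):
    """Return True if the wait-for graph contains a cycle."""
    visiting, done = set(), set()

    def visit(t):
        if t in done:
            return False
        if t in visiting:
            return True  # reached a transaction already on the current path
        visiting.add(t)
        cyclic = any(visit(u) for u in waits_for.get(t, ()))
        visiting.discard(t)
        done.add(t)
        return cyclic

    return any(visit(t) for t in waits_for)

# T1 waits for T2 and T2 waits for T1
deadlocked = has_deadlock({"T1": {"T2"}, "T2": {"T1"}})
# T1 waits for T2, T2 waits for T3, T3 waits for nobody
chain = has_deadlock({"T1": {"T2"}, "T2": {"T3"}})
```

Once a cycle is found, aborting any one transaction on it breaks the deadlock.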

Conservative 2PL
requires a transaction to lock all the items it accesses before the transaction begins

Deadlock free protocol

Strict 2PL
transaction does not release any of its exclusive locks (write) until after it commits or aborts

prevents dirty reads

Rigorous 2PL
transaction does not release any of its locks (read or write) until after it commits or aborts

prevents dirty reads

Purpose of database recovery

to bring the database into the most recent consistent state that existed prior to a failure

The UNDO-REDO approach

After a crash:
1. UNDO transactions that had not committed to ensure Atomicity (partial results discarded)
2. REDO committed transactions to ensure durability

uses a system log with write-ahead logging policy

Append-only file (System log)

keep track of all operations of all transactions in the order in which operations occurred

Stored-on disk (System log)

persistent except for disk or catastrophic failure, periodically backed up, guard against disk and
catastrophic failures

Main memory buffer (System log)

holds records being appended, occasionally whole buffer appended to end of log on disk

System log entries

[start_transaction, T]
[write_item, T, X, old_value, new_value] - T has changed X from old value to new
[commit, T]
[abort, T]

Write-Ahead Logging (WAL)

used to ensure that the log is consistent with the database and to ensure that the log can be used
to recover the database to a consistent state

WAL rules
log record for a page must be written before corresponding page is flushed to disk (for atomicity
so each operation is known and can be undone)

all log records must be written before commit (for durability so the effect of a committed
transaction is known)

WAL Commit Point

A transaction is said to be committed when: all of its operations are executed and all its log
records are flushed to disk

UNDO process
Scan the log from tail to head (backward in time)
Create a list of committed transactions
Create a list of rolled-back transactions
Undo updates of active transactions
Restore the before image
Append an [undo] record to the log (in case of a crash during recovery)

REDO process
Scan the log from head to tail (forward in time)
Redo updates of committed transactions
Use the after image for new values

Required only for log records after checkpoint record
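A simplified sketch of the two passes over a log of tuples shaped like the system log entries. It leaves out checkpoints and the [undo] records written during recovery; the `recover` function and the sample values are hypothetical.

```python
def recover(db, log):
    """Apply UNDO then REDO to the on-disk state db (a dict), using the log."""
    committed = {rec[1] for rec in log if rec[0] == "commit"}

    # UNDO: backward scan, restore before images of uncommitted transactions.
    for rec in reversed(log):
        if rec[0] == "write_item" and rec[1] not in committed:
            _, _, item, old_value, _ = rec
            db[item] = old_value

    # REDO: forward scan, reapply after images of committed transactions.
    for rec in log:
        if rec[0] == "write_item" and rec[1] in committed:
            _, _, item, _, new_value = rec
            db[item] = new_value
    return db

log = [
    ("start_transaction", "T1"),
    ("write_item", "T1", "X", 100, 90),
    ("start_transaction", "T2"),
    ("write_item", "T2", "Y", 50, 70),
    ("commit", "T1"),
    # crash: T2 never committed
]
# State on disk at the crash: T1's write was not flushed, T2's write was.
disk_state = {"X": 100, "Y": 70}
recovered = recover(disk_state, log)
```

After recovery X holds T1's committed value 90 (durability) and Y is back to 50 (atomicity), whatever had reached the disk before the crash.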

Query Parsing
Scanner identifies query tokens
Parser checks the query syntax
validation checks all attribute and relation names

Query Tree
represents a specific order of operations for executing a query

Leaves are the input relations, nodes are the operators, bottom-up execution

Algorithms for query operators

query operators have several alternative algorithms to execute the operator. Optimization
involves selecting the cheapest algorithm (cost associated with number of disk accesses or
number of blocks accessed)

SELECT algorithms
Sequential scan: cost = b (number of blocks of the data file)

Index scan: cost = x+s (x = number of levels of the index, s = cardinality of the selection i.e.
number of tuples that satisfy the select condition)
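A worked comparison of the two cost formulas. The figures for b, x and s are made-up illustration values, not measurements.

```python
def sequential_scan_cost(b):
    """cost = b, the number of blocks in the data file."""
    return b

def index_scan_cost(x, s):
    """cost = x + s: index levels plus number of qualifying tuples."""
    return x + s

b = 2000  # hypothetical data file size in blocks
x = 3     # hypothetical number of index levels

selective = index_scan_cost(x, s=50)        # few tuples qualify
unselective = index_scan_cost(x, s=10000)   # most tuples qualify
full_scan = sequential_scan_cost(b)
```

With s = 50 the index scan costs 53 block accesses against 2000 for the full scan. With s = 10000 it costs 10003, so the sequential scan is cheaper: the index pays off only when the selection is selective.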

PROJECT Algorithm
Scan the input and generate projected tuples as output, cost = b

ORDER BY and DISTINCT Algorithm

sort (and eliminate duplicates), cost = b log b

Aggregates and GROUP BY Algorithm

Scan the input relation, build the groups if needed and compute the aggregates, cost = b log b

JOIN Algorithms - Nested Loops Join

Iterate over all tuples of the 2 input relations and produce as output tuples that are the
concatenation of the input tuples satisfying the join condition, cost = bR * bS

The inner relation may be accessed through an index on the join attributes if such an index exists,
cost = bR + rR * (cost of the index search on relation S), where rR is the number of records of the
outer relation R
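An in-memory sketch of the basic nested loops join, without the index variant. Relations are lists of dicts and the sample tuples are hypothetical. The code loops over tuples, whereas the cost formula counts blocks.

```python
def nested_loops_join(r, s, r_attr, s_attr):
    """Concatenate every pair (tr, ts) that satisfies tr[r_attr] == ts[s_attr]."""
    result = []
    for tr in r:        # outer relation
        for ts in s:    # inner relation, rescanned for every outer tuple
            if tr[r_attr] == ts[s_attr]:
                result.append({**tr, **ts})
    return result

employee = [
    {"lname": "Smith", "dno": 5},
    {"lname": "Wong", "dno": 5},
    {"lname": "Zelaya", "dno": 4},
]
department = [
    {"dnumber": 5, "dname": "Research"},
    {"dnumber": 1, "dname": "Headquarters"},
]

joined = nested_loops_join(employee, department, "dno", "dnumber")
```

Zelaya finds no matching department and is dropped, and every employee tuple is still compared with every department tuple, which is where the bR * bS cost comes from.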

JOIN Algorithms - Merge Join

Sort the 2 input relations each according to the join attribute and merge the tuples satisfying the
join condition, cost = bR + bS + (cost for sorting both R and S)

JOIN Algorithms - Hash Join

Hash the 2 relations into buckets using the same hash functions over the respective join
attributes. Join the tuples of corresponding buckets using the nested loops join algorithm, Cost =
3 (bR + bS)
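An in-memory sketch of the same idea: partition both relations with one hash function, then join only the corresponding buckets with nested loops. The bucket count, function name and sample tuples are assumptions; a disk-based version would write the buckets out and read them back.

```python
def hash_join(r, s, r_attr, s_attr, n_buckets=4):
    """Join r and s on r_attr = s_attr by hashing both into buckets."""
    r_buckets = [[] for _ in range(n_buckets)]
    s_buckets = [[] for _ in range(n_buckets)]
    for tr in r:
        r_buckets[hash(tr[r_attr]) % n_buckets].append(tr)
    for ts in s:
        s_buckets[hash(ts[s_attr]) % n_buckets].append(ts)

    result = []
    # Matching tuples always land in buckets with the same index.
    for rb, sb in zip(r_buckets, s_buckets):
        for tr in rb:
            for ts in sb:
                if tr[r_attr] == ts[s_attr]:
                    result.append({**tr, **ts})
    return result

employee = [
    {"lname": "Smith", "dno": 5},
    {"lname": "Wong", "dno": 5},
    {"lname": "Zelaya", "dno": 4},
]
department = [
    {"dnumber": 5, "dname": "Research"},
    {"dnumber": 4, "dname": "Administration"},
]

joined = hash_join(employee, department, "dno", "dnumber")
joined_names = sorted((t["lname"], t["dname"]) for t in joined)
```

The equality test inside the bucket loop is still needed because different join values can hash to the same bucket.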

Materialization
Creating, storing and passing temporary results. Required for pipeline-blocking operators when
the temporary input relations do not fit in the available memory buffers

Pipelining
combines several operations into one, avoids writing temporary files

Logical Optimization (Query Optimization)

Query rewrite: reorganization of the query operators using rewrite rules to generate equivalent
query trees

Physical optimization (query optimization)

selection of the algorithms for the query operators that produce the smallest tree cost based on
the query estimates

General Optimization Approach (Algebraic)

1. split select operators with a composite condition into a cascade of selects
2. push down selections
3. convert cartesian products followed by selections into joins
4. define an efficient join order
5. introduce projections

80-20 rule
80% of processing accounted for by only 20% of queries and transactions

Tuning Indexes when Queries take too long to run

index attributes frequently used in search conditions

index primary key and foreign key attributes

create an additional index on associative tables

Tuning indexes when transactions take too long

drop indexes that may not get utilized in queries

drop indexes that may undergo too much updating if based on an attribute that undergoes
frequent changes

