UNIT II - New
TABLE: An arrangement of words, numbers, or signs, or combinations of them, as in parallel columns, to exhibit a set of
facts or relations in a definite, compact, and comprehensive form; a synopsis or scheme.
Relational Model
The relational model is a database model that uses tables to store data in a simple way. Its simplicity comes from
representing each entity as a table and each instance of that entity as a row of the table (a tuple).
Example: A student entity is represented as a table, whereas an individual student corresponds to a row in the
table. A relation in the relational model consists of a relation schema and a relation instance.
Relation Schema
A relation schema contains the basic information of a table or relation. This information includes the name of the
table, the names of the columns, and the data type associated with each column.
For Example: A relation schema for the relation called students could be expressed using the following
representation.
Students (sid : string, name: string, login : string, age : integer, gpa : real)
A1, A2, …, An are attributes
R = (A1, A2, …, An ) is a relation schema
E.g. Customer-schema =
(customer-name, customer-street, customer-city)
r(R) is a relation on the relation schema R
E.g. customer (Customer-schema)
Relation Instance
A relation instance is a set of rows (tuples) that conform to the schema of the relation.
It can be thought of as a table in which each tuple is a row, and all rows have the same number of fields.
Domain : A domain is essentially a data type: the set of values an attribute may take. Attributes can be thought of as
columns in a table. Therefore, an attribute domain refers to the data type associated with a column.
Relation Cardinality: The relation cardinality is the number of tuples in the relation.
Relation Degree: The relation degree is the number of fields in the relation.
Tuples / Records : The rows of the table are also known as records or tuples.
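The terms above can be checked on a small table. The following is a minimal sketch, assuming Python's built-in sqlite3 module and an in-memory database; the three student rows are invented for illustration and are not part of these notes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Relation schema: table name, column names, and column data types.
conn.execute(
    "CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
# Relation instance: hypothetical sample tuples, invented for this sketch.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "Ann", "ann@cs", 19, 3.4),
        ("s2", "Bob", "bob@cs", 20, 3.1),
        ("s3", "Cid", "cid@ee", 18, 3.8),
    ],
)

cursor = conn.execute("SELECT * FROM Students")
rows = cursor.fetchall()
degree = len(cursor.description)  # number of fields (columns)
cardinality = len(rows)           # number of tuples (rows)
print("degree:", degree, "cardinality:", cardinality)
```

Here the degree is fixed by the schema, while the cardinality changes whenever rows are inserted or deleted.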
BASIC STRUCTURE:
Formally, given sets D1, D2, …, Dn, a relation r is a subset of D1 × D2 × … × Dn.
Thus a relation is a set of n-tuples (a1, a2, …, an) where each ai ∈ Di.
Example: if
SQL
Introduction to SQL
SQL is a standard computer language for accessing and manipulating databases.
What is SQL?
SQL stands for Structured Query Language
SQL allows you to access a database
SQL is an ANSI standard computer language
SQL can execute queries against a database
SQL can retrieve data from a database
SQL can insert new records in a database
SQL can delete records from a database
SQL can update records in a database
SQL is easy to learn
SQL is a Standard - BUT....
SQL is an ANSI (American National Standards Institute) standard computer language for accessing and manipulating database
systems. SQL statements are used to retrieve and update data in a database. SQL works with database programs like
Microsoft Access, DB2, Informix, MS SQL Server, Oracle, Sybase, etc.
Unfortunately, there are many different versions of the SQL language, but to be in compliance with the ANSI standard, they
must support the same major keywords in a similar manner (such as SELECT, UPDATE, DELETE, INSERT, WHERE, and others).
Note: Most of the SQL database programs also have their own proprietary extensions in addition to the SQL standard!
The table above contains three records (one for each person) and four columns (LastName, FirstName, Address, and City).
SQL Queries
With SQL, we can query a database and have a result set returned. A query like this:
SELECT LastName FROM Persons
returns a result set like this:
LastName
Hansen
Svendson
Pettersen
Note: Some database systems require a semicolon at the end of the SQL statement. We don't use the semicolon in our
tutorials.
The result of selecting the LastName and FirstName columns:
LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari
Result
Syntax :
SELECT DISTINCT column_name(s) FROM table_name
"Orders" table
Company OrderNumber
Sega 3412
W3Schools 2312
Trio 4678
W3Schools 6798
Result of SELECT Company FROM Orders (duplicates are kept):
Company
Sega
W3Schools
Trio
W3Schools
Result of SELECT DISTINCT Company FROM Orders:
Company
Sega
W3Schools
Trio
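The difference between the two results above can be reproduced with a short script. This is a sketch only, assuming Python's built-in sqlite3 module and an in-memory copy of the "Orders" table shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Company TEXT, OrderNumber INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("Sega", 3412), ("W3Schools", 2312), ("Trio", 4678), ("W3Schools", 6798)],
)

# Plain SELECT keeps one output row per table row, duplicates included.
all_companies = [row[0] for row in conn.execute("SELECT Company FROM Orders")]
# SELECT DISTINCT collapses equal values into one output row.
distinct_companies = [
    row[0] for row in conn.execute("SELECT DISTINCT Company FROM Orders")
]
print(len(all_companies), sorted(distinct_companies))
```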
SQL WHERE Clause: The WHERE clause is used to specify a selection criterion.
The WHERE Clause
To conditionally select data from a table, a WHERE clause can be added to the SELECT statement.
Syntax
SELECT column FROM table WHERE column operator value
Operator    Description
=           Equal
BETWEEN     Between an inclusive range
WHERE City='Sandnes'
"Persons" table
Result
Using Quotes
Note that we have used single quotes around the conditional values in the examples.
SQL uses single quotes around text values (some database systems will also accept double quotes, although standard SQL
reserves double quotes for identifiers). Numeric values should not be enclosed in quotes.
For text values:
This is correct:
SELECT * FROM Persons WHERE FirstName='Tove'
This is wrong (the text value is not quoted):
SELECT * FROM Persons WHERE FirstName=Tove
Syntax
SELECT column FROM table WHERE column LIKE pattern
A "%" sign can be used to define wildcards (missing letters in the pattern) both before and after the pattern.
Using LIKE
The following SQL statement will return persons with first names that start with an 'O':
SELECT * FROM Persons WHERE FirstName LIKE 'O%'
The following SQL statement will return persons with first names that end with an 'a':
SELECT * FROM Persons WHERE FirstName LIKE '%a'
The following SQL statement will return persons with first names that contain the pattern 'la':
SELECT * FROM Persons WHERE FirstName LIKE '%la%'
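The three LIKE patterns above can be tried against the names used in these notes. The sketch below assumes Python's built-in sqlite3 module; the helper first_names is a hypothetical name introduced here. Note that in SQLite, LIKE is case-insensitive for ASCII letters by default, which other database systems may not share.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (LastName TEXT, FirstName TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [("Hansen", "Ola"), ("Svendson", "Tove"), ("Pettersen", "Kari")],
)

def first_names(pattern):
    """Return the first names matching a LIKE pattern, sorted for stable output."""
    query = "SELECT FirstName FROM Persons WHERE FirstName LIKE ? ORDER BY FirstName"
    return [row[0] for row in conn.execute(query, (pattern,))]

starts_with_o = first_names("O%")    # names that start with 'O'
ends_with_a = first_names("%a")      # names that end with 'a'
contains_la = first_names("%la%")    # names that contain 'la'
contains_o = first_names("%o%")      # SQLite also matches the capital 'O' in 'Ola'
print(starts_with_o, ends_with_a, contains_la, contains_o)
```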
Syntax
INSERT INTO table_name VALUES (value1, value2,....)
You can also specify the columns for which you want to insert data:
INSERT INTO table_name (column1, column2,...) VALUES (value1, value2,....)
Insert a New Row
This "Persons" table:
Rasmussen Storgt 67
SQL DELETE Statement:
The DELETE Statement
The DELETE statement is used to delete rows in a table.
Syntax
DELETE FROM table_name WHERE column_name = some_value
Person:
Delete a Row
"Nina Rasmussen" is going to be deleted:
DELETE FROM Person WHERE LastName = 'Rasmussen'
Result
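The INSERT and DELETE statements can be followed end to end in a small script. This is a minimal sketch, assuming Python's built-in sqlite3 module; the column layout of "Person" and the address and city of the first row are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (LastName TEXT, FirstName TEXT, Address TEXT, City TEXT)"
)
# Insert a full row; the address and city values are invented for this sketch.
conn.execute(
    "INSERT INTO Person VALUES ('Pettersen', 'Kari', 'Example St 1', 'Sandnes')"
)
# Insert into selected columns only; City is left NULL.
conn.execute(
    "INSERT INTO Person (LastName, FirstName, Address) VALUES (?, ?, ?)",
    ("Rasmussen", "Nina", "Storgt 67"),
)
count_after_insert = conn.execute("SELECT COUNT(*) FROM Person").fetchone()[0]

# Delete only the rows that match the WHERE condition.
deleted = conn.execute("DELETE FROM Person WHERE LastName = 'Rasmussen'").rowcount
remaining = [row[0] for row in conn.execute("SELECT LastName FROM Person")]
print(count_after_insert, deleted, remaining)
```

Leaving out the WHERE clause in a DELETE removes every row of the table, so the condition deserves a second look before the statement is run.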
SQL ORDER BY: The ORDER BY keyword is used to sort the result.
Sort the Rows: The ORDER BY clause is used to sort the rows.
Orders:
Company OrderNumber
Sega 3412
W3Schools 2312
W3Schools 6798
Example
To display the company names in alphabetical order:
SELECT Company, OrderNumber FROM Orders ORDER BY Company
Result:
Company OrderNumber
Sega 3412
W3Schools 6798
W3Schools 2312
Example
To display the company names in alphabetical order AND the OrderNumber in numerical order:
SELECT Company, OrderNumber FROM Orders ORDER BY Company, OrderNumber
Result:
Company OrderNumber
Sega 3412
W3Schools 2312
W3Schools 6798
Example
To display the company names in reverse alphabetical order:
SELECT Company, OrderNumber FROM Orders ORDER BY Company DESC
Result:
Company OrderNumber
W3Schools 6798
W3Schools 2312
Sega 3412
Example
To display the company names in reverse alphabetical order AND the OrderNumber in numerical order:
SELECT Company, OrderNumber FROM Orders ORDER BY Company DESC, OrderNumber ASC
Result:
Company OrderNumber
W3Schools 2312
W3Schools 6798
Sega 3412
Notice that there are two equal company names (W3Schools) in the result above. The second sort column only affects the
order of rows that have equal (or null) values in the first sort column.
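The effect of a second sort column can be verified directly. The sketch below assumes Python's built-in sqlite3 module and uses the three order rows that appear in the results above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (Company TEXT, OrderNumber INTEGER)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?)",
    [("Sega", 3412), ("W3Schools", 6798), ("W3Schools", 2312)],
)

# Company ascending; ties between equal companies are broken by OrderNumber.
ascending = conn.execute(
    "SELECT Company, OrderNumber FROM Orders ORDER BY Company, OrderNumber"
).fetchall()
# Company descending, but OrderNumber still ascending within each company.
mixed = conn.execute(
    "SELECT Company, OrderNumber FROM Orders ORDER BY Company DESC, OrderNumber ASC"
).fetchall()
print(ascending)
print(mixed)
```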
Example
Use AND to display each person with the first name equal to "Tove", and the last name equal to "Svendson":
SELECT * FROM Persons WHERE FirstName='Tove' AND LastName='Svendson'
Result:
Example
Use OR to display each person with the first name equal to "Tove", or the last name equal to "Svendson":
SELECT * FROM Persons WHERE FirstName='Tove' OR LastName='Svendson'
Result:
Example
You can also combine AND and OR (use parentheses to form complex expressions):
SELECT * FROM Persons WHERE (FirstName='Tove' OR FirstName='Stephen') AND LastName='Svendson'
Result:
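The three AND / OR queries can be compared side by side. This is a sketch assuming Python's built-in sqlite3 module; the row for "Stephen Svendson" is a hypothetical addition so that the OR query returns more than one row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (LastName TEXT, FirstName TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?)",
    [
        ("Hansen", "Ola"),
        ("Svendson", "Tove"),
        ("Pettersen", "Kari"),
        ("Svendson", "Stephen"),  # hypothetical row, added for illustration
    ],
)

def names(where):
    """Run a WHERE condition and return (LastName, FirstName) rows sorted by first name."""
    sql = f"SELECT LastName, FirstName FROM Persons WHERE {where} ORDER BY FirstName"
    return conn.execute(sql).fetchall()

and_rows = names("FirstName='Tove' AND LastName='Svendson'")
or_rows = names("FirstName='Tove' OR LastName='Svendson'")
combined_rows = names("(FirstName='Tove' OR FirstName='Stephen') AND LastName='Svendson'")
print(and_rows, or_rows, combined_rows)
```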
SQL IN:
IN
The IN operator may be used if you know the exact value you want to return for at least one of the columns.
SELECT column_name FROM table_name WHERE column_name IN (value1,value2,..)
Example 1
To display the persons with LastName equal to "Hansen" or "Pettersen", use the following SQL:
SELECT * FROM Persons WHERE LastName IN ('Hansen','Pettersen')
Result:
SQL BETWEEN:
BETWEEN ... AND
The BETWEEN ... AND operator selects a range of data between two values. These values can be numbers, text, or dates.
SELECT column_name FROM table_name WHERE column_name BETWEEN value1 AND value2
Example 1
To display the persons alphabetically between (and including) "Hansen" and "Pettersen", use the following SQL:
SELECT * FROM Persons WHERE LastName BETWEEN 'Hansen' AND 'Pettersen'
Result:
IMPORTANT! The SQL standard defines BETWEEN ... AND as inclusive: x BETWEEN a AND b is equivalent to x >= a AND x <= b, so
persons with the LastName "Hansen" or "Pettersen" are listed. Some database systems have nevertheless treated the test
values differently, either excluding both of them or including the first and excluding the last. String comparisons
also depend on the collation in use. Therefore: check how your database treats the BETWEEN ... AND operator!
Example 2
To display the persons outside the range used in the previous example, use the NOT operator:
SELECT * FROM Persons WHERE LastName NOT BETWEEN 'Hansen' AND 'Pettersen'
Result:
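One way to check how a particular database treats IN and BETWEEN is to run them on a few known values. The sketch below does this for SQLite through Python's built-in sqlite3 module; "Nordmann" is a hypothetical last name added so that one value lies strictly inside the range.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (LastName TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?)",
    [("Hansen",), ("Nordmann",), ("Pettersen",), ("Svendson",)],
)

def last_names(where):
    """Return the last names that satisfy a WHERE condition, in sorted order."""
    sql = f"SELECT LastName FROM Persons WHERE {where} ORDER BY LastName"
    return [row[0] for row in conn.execute(sql)]

in_rows = last_names("LastName IN ('Hansen', 'Pettersen')")
between_rows = last_names("LastName BETWEEN 'Hansen' AND 'Pettersen'")
not_between_rows = last_names("LastName NOT BETWEEN 'Hansen' AND 'Pettersen'")
print(in_rows, between_rows, not_between_rows)
```

In SQLite both test values are included in the BETWEEN result, which matches the inclusive reading.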
SQL Alias: With SQL, aliases can be used for column names and table names.
Column Name Alias:
The syntax is: SELECT column AS column_alias FROM table
Hansen Ola
Svendson Tove
Pettersen Kari
LastName FirstName
Hansen Ola
Svendson Tove
Pettersen Kari
Create a Table
CREATE TABLE table_name(column_name1 data_type,column_name2 data_type,...)
Data Type: Description
integer(size), int(size), smallint(size), tinyint(size): Hold integers only. The maximum number of digits is specified in parentheses.
decimal(size,d), numeric(size,d): Hold numbers with fractions. The maximum number of digits is specified in "size". The maximum number of digits to the right of the decimal point is specified in "d".
char(size): Holds a fixed length string (can contain letters, numbers, and special characters). The fixed size is specified in parentheses.
varchar(size): Holds a variable length string (can contain letters, numbers, and special characters). The maximum size is specified in parentheses.
Create Index
Indices are created in an existing table to locate rows more quickly and efficiently.
It is possible to create an index on one or more columns of a table, and each index is given a name.
The users cannot see the indexes; they are just used to speed up queries.
Note: Updating a table containing indexes takes more time than updating a table without, because the indexes also
need to be updated. So, it is a good idea to create indexes only on columns that are often used for a search.
A UNIQUE INDEX
Creates a unique index on a table. A unique index means that two rows cannot have the same index value.
CREATE UNIQUE INDEX index_name ON table_name (column_name)
A SIMPLE INDEX
Creates a simple index on a table. When the UNIQUE keyword is omitted, duplicate values are allowed.
CREATE INDEX index_name ON table_name (column_name)
Example
This example creates a simple index, named "PersonIndex", on the LastName field of the Person table:
CREATE INDEX PersonIndex ON Person (LastName)
If you want to index the values in a column in descending order, you can add the reserved word DESC after the column name:
CREATE INDEX PersonIndex ON Person (LastName DESC)
If you want to index more than one column you can list the column names within the parentheses, separated by commas:
CREATE INDEX PersonIndex ON Person (LastName, FirstName)
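The CREATE TABLE and CREATE INDEX statements can be exercised together. This is a minimal sketch, assuming Python's built-in sqlite3 module; the column list of "Person" and the index name PersonNameUnique are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Person (LastName varchar(30), FirstName varchar(30), Age int)"
)
# A simple index on two columns: duplicate values are allowed.
conn.execute("CREATE INDEX PersonIndex ON Person (LastName, FirstName)")
# A unique index: two rows cannot share the same (LastName, FirstName) pair.
conn.execute("CREATE UNIQUE INDEX PersonNameUnique ON Person (LastName, FirstName)")

conn.execute("INSERT INTO Person (LastName, FirstName) VALUES ('Hansen', 'Ola')")
duplicate_rejected = False
try:
    conn.execute("INSERT INTO Person (LastName, FirstName) VALUES ('Hansen', 'Ola')")
except sqlite3.IntegrityError:
    # The unique index rejects the second identical pair.
    duplicate_rejected = True

# SQLite records user-created indexes in its sqlite_master catalog table.
index_names = sorted(
    row[0] for row in conn.execute("SELECT name FROM sqlite_master WHERE type='index'")
)
print(index_names, duplicate_rejected)
```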
Drop a Database:
DROP DATABASE database_name;
TRUNCATE A TABLE
If the records stored in a table are no longer needed but the table structure has to be retained, the records alone can be deleted.
TRUNCATE TABLE table_name;
Reusing storage (Oracle syntax; keeps the space allocated to the table):
TRUNCATE TABLE table_name REUSE STORAGE;
ALTER TABLE: The ALTER TABLE statement is used to add, drop, or modify columns in an existing table. The exact syntax varies between database systems.
1. ALTER TABLE table_name ADD (column_name datatype);
2. ALTER TABLE table_name DROP COLUMN column_name;
3. ALTER TABLE table_name MODIFY (column_definition1, column_definition2);
Person:
Example
To add a column named "City" in the "Person" table:
ALTER TABLE Person ADD(City varchar(30));
Result:
Example
To drop the "Address" column in the "Person" table:
ALTER TABLE Person DROP COLUMN Address;
Result:
Pettersen Kari
SQL Functions: SQL has a lot of built-in functions for counting and calculations.
Function Syntax
SELECT function(column) FROM table
Types of Functions
There are several basic types and categories of functions in SQL. The basic types of functions are:
Aggregate Functions
Scalar functions
Aggregate functions
Aggregate functions operate against a collection of values, but return a single value.
Note: If an aggregate function is used alongside other, non-aggregated expressions in the item list of a SELECT statement,
the SELECT must have a GROUP BY clause.
Hansen, Ola 34
Svendson, Tove 45
Pettersen, Kari 19
Scalar functions
Scalar functions operate against a single value, and return a single value based on the input value.
GROUP BY:
GROUP BY was added to SQL because aggregate functions (like SUM) return the aggregate of all column values every time
they are called, and without the GROUP BY clause it was impossible to find the sum for each individual group of column
values. The syntax for the GROUP BY clause is:
SELECT column,SUM(column) FROM table GROUP BY column
Example: This is "Sales" Table:
Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100
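This SQL, without a GROUP BY clause:
SELECT Company,SUM(Amount) FROM Sales
returns this result:
Company SUM(Amount)
W3Schools 17100
IBM 17100
W3Schools 17100
The above code is invalid because the Company column is neither aggregated nor listed in a GROUP BY clause. A GROUP BY clause will solve this problem: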
SELECT Company,SUM(Amount) FROM Sales GROUP BY Company
Company SUM(Amount)
W3Schools 12600
IBM 4500
HAVING...
HAVING was added to SQL because the WHERE keyword could not be used against aggregate functions (like SUM), and
without HAVING it would be impossible to test for result conditions.
The syntax for the HAVING clause is:
SELECT column,SUM(column) FROM table GROUP BY column HAVING SUM(column) condition value
This is the "Sales" table:
Company Amount
W3Schools 5500
IBM 4500
W3Schools 7100
This SQL:
SELECT Company,SUM(Amount) FROM Sales GROUP BY Company HAVING SUM(Amount)>10000
returns this result:
Company SUM(Amount)
W3Schools 12600
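The GROUP BY and HAVING results above can be reproduced from the "Sales" table. The sketch below assumes Python's built-in sqlite3 module and an in-memory copy of the three sales rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sales (Company TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO Sales VALUES (?, ?)",
    [("W3Schools", 5500), ("IBM", 4500), ("W3Schools", 7100)],
)

# Without GROUP BY, SUM aggregates over every row of the table.
total = conn.execute("SELECT SUM(Amount) FROM Sales").fetchone()[0]
# With GROUP BY, SUM is computed once per distinct Company value.
per_company = dict(
    conn.execute("SELECT Company, SUM(Amount) FROM Sales GROUP BY Company").fetchall()
)
# HAVING filters the groups after aggregation; WHERE cannot test SUM(Amount).
large_groups = conn.execute(
    "SELECT Company, SUM(Amount) FROM Sales "
    "GROUP BY Company HAVING SUM(Amount) > 10000"
).fetchall()
print(total, per_company, large_groups)
```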
Syntax
SELECT column_name(s) INTO newtable [IN externaldatabase] FROM source
If you only want to copy a few fields, you can do so by listing them after the SELECT statement:
SELECT LastName,FirstName INTO Persons_backup FROM Persons;
You can also add a WHERE clause. The following example creates a "Persons_backup" table with two columns (FirstName and
LastName) by extracting the persons who live in "Sandnes" from the "Persons" table:
SELECT LastName,FirstName INTO Persons_backup FROM Persons WHERE City='Sandnes';
Selecting data from more than one table is also possible. The following example creates a new table "Empl_Ord_backup" that
contains data from the two tables Employees and Orders:
SELECT Employees.Name,Orders.Product INTO Empl_Ord_backup FROM Employees INNER JOIN Orders
ON Employees.Employee_ID=Orders.Employee_ID;
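Not every database system supports SELECT ... INTO for creating tables. SQLite, for instance, does not, and uses CREATE TABLE ... AS SELECT for the same purpose. The sketch below shows that substitute technique (not SELECT INTO itself) through Python's built-in sqlite3 module; the city values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Persons (LastName TEXT, FirstName TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO Persons VALUES (?, ?, ?)",
    [
        ("Hansen", "Ola", "Sandnes"),        # city values are invented
        ("Svendson", "Tove", "Sandnes"),
        ("Pettersen", "Kari", "Stavanger"),
    ],
)

# SQLite counterpart of: SELECT LastName,FirstName INTO Persons_backup FROM Persons WHERE ...
conn.execute(
    "CREATE TABLE Persons_backup AS "
    "SELECT LastName, FirstName FROM Persons WHERE City = 'Sandnes'"
)

cursor = conn.execute("SELECT * FROM Persons_backup")
backup_columns = [column[0] for column in cursor.description]
backup_rows = sorted(cursor.fetchall())
print(backup_columns, backup_rows)
```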
3. QUERY LANGUAGES
A query language is a language in which a user requests information from the database.
Categories of languages are procedural and non-procedural (declarative).
PROCEDURAL LANGUAGE (RELATIONAL ALGEBRA):
Six basic operators : select, project, union, set difference, Cartesian product and rename
The operators take one or two relations as inputs and give a new relation as a result.
1. Relational Algebra:
• Relational algebra is a procedural query language.
• It consists of a set of operations that take one or two relations as input and produce a new
relation as their result.
• the formal description of how a relational database operates
• an interface to the data stored in the database itself
• the mathematics which underpin SQL operations
Operators in relational algebra are not necessarily the same as SQL operators, even if they have
the same name. For example, the SELECT statement exists in SQL, and also exists in relational
algebra. These two uses of SELECT are not the same. The DBMS must take whatever SQL
statements the user types in and translate them into relational algebra operations before
applying them to the database.
Terminology
Set Operations:
The SQL operations union, intersect, and except operate on relations and
correspond to the relational-algebra operations ∪, ∩, and −. Like union, intersection, and set difference
in relational algebra, the relations participating in the operations must be compatible; that is, they must
have the same set of attributes. We shall now construct queries involving the union, intersect, and
except operations of two sets: the set of all customers who have an account at the bank, which can be
derived by
select customer-name from depositor
and the set of customers who have a loan at the bank, which can be derived by
select customer-name from borrower
We shall refer to the relations obtained as the result of the preceding queries as d and b, respectively.
UNION of R and S : The union of two relations is a relation that includes all the tuples that are
either in R or in S or in both R and S. Duplicate tuples are eliminated.
INTERSECTION of R and S: The intersection of R and S is a relation that includes all tuples that
are both in R and S.
DIFFERENCE of R and S : The difference of R and S is the relation that contains all the tuples
that are in R but that are not in S.
1. UNION:
UNION Example
(select customer-name from depositor) union (select customer-name from borrower)
The union operation automatically eliminates duplicates, unlike the select clause. Thus, in the
preceding query, if a customer—say, Jones—has several accounts or loans (or both) at the bank, then
Jones will appear only once in the result.
If we want to retain all duplicates, we must write union all in place of union:
(Select customer-name from depositor) union all (select customer-name from borrower)
The number of duplicate tuples in the result is equal to the total number of duplicates that appear in
both d and b. Thus, if Jones has three accounts and two loans at the bank, then there will be five
tuples with the name Jones in the result.
2. INTERSECTION: C ← A ∩ B
(select distinct customer-name from depositor) intersect (select distinct customer-name from
borrower)
The intersect operation automatically eliminates duplicates. Thus, in the
preceding query, if a customer, say Jones, has several accounts and loans at the bank, then Jones will
appear only once in the result.
If we want to retain all duplicates, we must write intersect all in place of intersect:
(select customer-name from depositor) intersect all (select customer-name from borrower)
The number of duplicate tuples that appear in the result is equal to the minimum
number of duplicates in both d and b. Thus, if Jones has three accounts and two loans at the bank,
then there will be two tuples with the name Jones in the result.
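The duplicate counts described for union, union all, and intersect can be checked on the Jones example. This is a sketch assuming Python's built-in sqlite3 module. The column names use underscores because hyphens are not valid in unquoted SQL identifiers, the account and loan numbers and the customer Smith are invented, and the queries are written without parentheses because SQLite does not accept parenthesised operands in compound selects. SQLite also has no INTERSECT ALL, so that variant is not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE depositor (customer_name TEXT, account_number TEXT)")
conn.execute("CREATE TABLE borrower (customer_name TEXT, loan_number TEXT)")
# Jones has three accounts and two loans; Smith (hypothetical) has one account.
conn.executemany(
    "INSERT INTO depositor VALUES (?, ?)",
    [("Jones", "A-1"), ("Jones", "A-2"), ("Jones", "A-3"), ("Smith", "A-4")],
)
conn.executemany(
    "INSERT INTO borrower VALUES (?, ?)", [("Jones", "L-1"), ("Jones", "L-2")]
)

def run(operator):
    """Combine the two customer-name queries with the given set operator."""
    sql = (
        f"SELECT customer_name FROM depositor {operator} "
        "SELECT customer_name FROM borrower"
    )
    return sorted(row[0] for row in conn.execute(sql))

union_rows = run("UNION")          # duplicates eliminated
union_all_rows = run("UNION ALL")  # duplicates retained
intersect_rows = run("INTERSECT")  # in both d and b
except_rows = run("EXCEPT")        # in d but not in b
print(union_rows, union_all_rows.count("Jones"), intersect_rows, except_rows)
```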
The Cartesian Product is also an operator which works on two sets. It is sometimes called the
CROSS PRODUCT or CROSS JOIN. It combines the tuples of one relation with all the tuples of
the other relation.
SELECTdob '01/JAN/1950'(employee)
6. Relational PROJECT :
The PROJECT operation is used to select a subset of the attributes of a relation by specifying
the names of the required attributes.
For example, to get a list of all employees' surnames and employee numbers:
PROJECT surname, empno (employee)
7. JOIN Operator : JOIN is used to combine related tuples from two relations:
In its simplest form the JOIN operator is just the cross product of the two relations.
As the join becomes more complex, tuples are removed within the cross product to make the
result of the join more meaningful.
JOIN allows you to evaluate a join condition between the attributes of the relations on which the
join is undertaken.
JOIN Example
Natural Join
Very often the JOIN involves an equality test, and is then described as an equi-join. Such
joins result in two attributes in the resulting relation having exactly the same value. A `natural
join' will remove the duplicate attribute(s).
In most systems a natural join will require that the attributes have the same name to identify the
attribute(s) to be used in the join. This may require a renaming mechanism.
If you do use natural joins make sure that the relations do not have two attributes with the same
name by accident.
OUTER JOINs:
Notice that much of the data is lost when applying a join to two relations. In some cases this
lost data might hold useful information. An outer join retains the information that would have
been lost from the tables, replacing missing data with nulls.
There are three forms of the outer join (left, right, and full), depending on which data is to be kept.
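The relationship between the cross product, an equi-join, and an outer join can be seen on two tiny relations. The sketch below assumes Python's built-in sqlite3 module; the employee and department tables, their columns, and all rows are hypothetical. Only the left outer join is shown, since right and full outer joins are not available in older SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (empno INTEGER, surname TEXT, depno INTEGER)")
conn.execute("CREATE TABLE department (depno INTEGER, dname TEXT)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Smith", 1), (2, "Jones", 2), (3, "Brown", None)],  # Brown has no department
)
conn.executemany(
    "INSERT INTO department VALUES (?, ?)",
    [(1, "Accounts"), (2, "Sales"), (3, "Research")],
)

# Cross product: every employee paired with every department (3 x 3 rows).
cross_count = conn.execute(
    "SELECT COUNT(*) FROM employee CROSS JOIN department"
).fetchone()[0]
# Equi-join: keep only the pairs whose depno values are equal.
inner_rows = conn.execute(
    "SELECT surname, dname FROM employee "
    "JOIN department ON employee.depno = department.depno ORDER BY surname"
).fetchall()
# Left outer join: unmatched employees are kept, padded with NULL (None).
left_rows = conn.execute(
    "SELECT surname, dname FROM employee "
    "LEFT OUTER JOIN department ON employee.depno = department.depno ORDER BY surname"
).fetchall()
print(cross_count, inner_rows, left_rows)
```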
• A tuple variable is a variable that takes on tuples of a particular relation schema(table) as values.
That is, every value assigned to a given tuple variable has the same number and type of fields.
• A tuple relational calculus query has the form { T | p(T) }, where
• T is a tuple variable and p(T) denotes a formula that describes T; we will shortly define formulas
and queries rigorously.
Sailors:
Sid Rating
1 8
2 10
3 6
Find all sailors with a rating above 7. {T | T ∈ Sailors ∧ T.Rating > 7}
• A domain variable is a variable that ranges over the values in the domain of some attribute.
• A DRC query has the form {<x1, x2, …, xn> | p(x1, x2, …, xn)},
• where each xi is either a domain variable or a constant. The result is the set of all tuples <x1, x2, …, xn> for which the formula evaluates to true.
Find all sailors with a rating above 7. {<Sid, Rating> | <Sid, Rating> ∈ Sailors ∧ Rating > 7}
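Both calculus queries read much like set comprehensions. The sketch below is an analogy in Python, not a formal semantics: the tuple-calculus version ranges a variable over whole tuples, while the domain-calculus version ranges variables over individual field values. The Sailor type name is an assumption made for this example.

```python
from collections import namedtuple

Sailor = namedtuple("Sailor", ["sid", "rating"])
# The Sailors relation from the table above, as a set of tuples.
sailors = {Sailor(1, 8), Sailor(2, 10), Sailor(3, 6)}

# Tuple relational calculus: {T | T in Sailors and T.Rating > 7}
trc_result = {t for t in sailors if t.rating > 7}

# Domain relational calculus: {<Sid, Rating> | <Sid, Rating> in Sailors and Rating > 7}
drc_result = {(sid, rating) for (sid, rating) in sailors if rating > 7}

print(sorted(trc_result), sorted(drc_result))
```

Both comprehensions state which tuples qualify without saying how to find them, which is the sense in which the calculus is non-procedural.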
AGGREGATE FUNCTIONS:
Aggregate functions are functions that take a collection (a set or multiset) of values as input and return
a single value. SQL offers five built-in aggregate functions:
Average: avg
Minimum: min
Maximum: max
Total: sum
Count: count
The input to sum and avg must be a collection of numbers, but the other operators can operate on
collections of nonnumeric data types, such as strings, as well.
As an illustration, consider the query “Find the average account balance at the Perryridge
branch.” We write this query as follows:
There are circumstances where we would like to apply the aggregate function not only to a single
set of tuples, but also to a group of sets of tuples; we specify this wish in SQL using the group by clause.
The attribute or attributes given in the group by clause are used to form groups. Tuples with the same
value on all attributes in the group by clause are placed in one group.
As an illustration, consider the query “Find the average account balance at each branch.” We
write this query as follows:
Retaining duplicates is important in computing an average. Suppose that the account balances
at the (small) Brighton branch are $1000, $3000, $2000 and $1000. The average balance is $7000/4 =
$1750.00. If duplicates were eliminated, we would obtain the wrong answer ($6000/3 = $2000).
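The Brighton arithmetic can be confirmed by computing the average with and without duplicates. This is a minimal sketch, assuming Python's built-in sqlite3 module; the table layout account(branch_name, balance) uses underscores in place of hyphens, and the two Perryridge balances are invented so that the group by query has more than one group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (branch_name TEXT, balance REAL)")
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [
        ("Brighton", 1000), ("Brighton", 3000), ("Brighton", 2000), ("Brighton", 1000),
        ("Perryridge", 400), ("Perryridge", 600),  # invented balances
    ],
)

# avg over the multiset of balances: duplicates are retained (7000 / 4).
avg_with_duplicates = conn.execute(
    "SELECT AVG(balance) FROM account WHERE branch_name = 'Brighton'"
).fetchone()[0]
# avg over distinct balances only: the wrong answer for this question (6000 / 3).
avg_without_duplicates = conn.execute(
    "SELECT AVG(DISTINCT balance) FROM account WHERE branch_name = 'Brighton'"
).fetchone()[0]
# One average per branch, using group by.
avg_per_branch = dict(
    conn.execute(
        "SELECT branch_name, AVG(balance) FROM account GROUP BY branch_name"
    ).fetchall()
)
print(avg_with_duplicates, avg_without_duplicates, avg_per_branch)
```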
VIEWS
We define a view in SQL by using the create view command. To define a view, we must give the view a
name and must state the query that computes the view. The form of the create view command is
create view v as <query expression>
where <query expression> is any legal query expression. The view name is represented by v. Observe
that the notation that we used for view definition in the relational algebra is based on that of SQL. As
an example, consider the view consisting of branch names and the names of customers who have either
an account or a loan at that branch. Assume that we want this view to be called all-customer. We
define this view as follows,
create view all-customer as (select branch-name, customer-name from depositor, account where
depositor.account-number = account.account-number) union (select branch-name, customer-name
from borrower, loan where borrower.loan-number = loan.loan-number);
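The attribute names of a view can be specified explicitly, as follows:
create view branch-total-loan (branch-name, total-loan) as select branch-name, sum (amount) from loan group by branch-name;
The preceding view gives for each branch the sum of the amounts of all the loans at the branch. Since
the expression sum (amount) does not have a name, the attribute name is specified explicitly in the view
definition. View names may appear in any place that a relation name may appear. Using the view
all-customer, we can find all customers of the Perryridge branch by writing
select customer-name from all-customer where branch-name = 'Perryridge';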
Note: The database design and structure will NOT be affected by the functions, where, or join statements in a view.
Syntax
CREATE VIEW view_name AS SELECT column_name(s) FROM table_name WHERE condition;
Note: The database does not store the view data! The database engine recreates the data, using the view's SELECT statement,
every time a user queries a view.
Using Views
A view can be used from inside a query, a stored procedure, or from inside another view. By adding functions, joins, etc., to
a view, you can present exactly the data you want to the user. The sample database Northwind has some views
installed by default. The view "Current Product List" lists all active products (products that are not discontinued) from the
Products table.
The view is created with the following SQL:
CREATE VIEW [Current Product List] AS SELECT ProductID,ProductName FROM Products WHERE
Discontinued=No;
Another view from the Northwind sample database selects every product in the Products table that has a unit price that is
higher than the average unit price:
CREATE VIEW [Products Above Average Price] AS SELECT ProductName,UnitPrice FROM Products WHERE
UnitPrice>(SELECT AVG(UnitPrice) FROM Products);
Another example view from the Northwind database calculates the total sale for each category in 1997. Note that this view
selects its data from another view called "Product Sales for 1997":
CREATE VIEW [Category Sales For 1997] AS SELECT DISTINCT CategoryName, Sum(ProductSales) AS
CategorySales FROM [Product Sales for 1997] GROUP BY CategoryName;
We can also add a condition to the query. Now we want to see the total sale only for the category "Beverages":
SELECT * FROM [Category Sales For 1997] WHERE CategoryName='Beverages';
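That a view stores a query rather than data can be observed by changing the underlying table and querying the view again. The sketch below assumes Python's built-in sqlite3 module; the product names and prices are invented, and the view name ProductsAboveAveragePrice is written without spaces or brackets to keep the example portable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Products (ProductName TEXT, UnitPrice REAL)")
conn.executemany(
    "INSERT INTO Products VALUES (?, ?)",
    [("A", 10), ("B", 20), ("C", 60)],  # invented products; average price is 30
)
conn.execute(
    "CREATE VIEW ProductsAboveAveragePrice AS "
    "SELECT ProductName, UnitPrice FROM Products "
    "WHERE UnitPrice > (SELECT AVG(UnitPrice) FROM Products)"
)

def above_average():
    """Query the view; its SELECT is re-evaluated on every call."""
    sql = "SELECT ProductName FROM ProductsAboveAveragePrice ORDER BY ProductName"
    return [row[0] for row in conn.execute(sql)]

before_insert = above_average()
# A new row changes both the table contents and the average (now 47.5).
conn.execute("INSERT INTO Products VALUES ('D', 100)")
after_insert = above_average()
print(before_insert, after_insert)
```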